Estimation of Soil Organic Carbon Content in Gannan Grassland Based on SSA-Optimized CatBoost
Submitted: 2024-08-11; revised: 2024-10-11
Keywords: soil organic carbon (SOC); machine learning; CatBoost model; optimization algorithm; Gannan grassland
DOI: 10.13227/j.hjkx.202408081
Abstract
Estimating soil organic carbon (SOC) content in the grasslands of Gannan Tibetan Autonomous Prefecture, characterizing its spatial distribution, and identifying its main influencing factors are important for improving grassland quality, optimizing management, regulating climate, and maintaining ecosystem functions. Taking the grasslands of Gannan Tibetan Autonomous Prefecture, Gansu Province, as the study area, a multi-feature dataset was built by integrating soil properties, meteorological factors, elevation, and vegetation indices. Pearson correlation analysis was used to screen 24 significant feature factors, and normalized contributions were derived from SHAP values. The data were split into training and test sets at a ratio of 8:2, ten-fold cross-validation was applied, and the experiments were repeated five times; the models were evaluated using MAE, RMSE, and R². The sparrow search algorithm (SSA) and the whale optimization algorithm (WOA) were then used to optimize the model parameters and estimate SOC content. The results showed that the model-estimated surface SOC stocks of the Gannan grasslands decreased gradually from west to east and were high in the northwest and low in the southeast; the northwest has a relatively low mean temperature and a higher organic carbon content. Annual mean temperature, the enhanced vegetation index (EVI), and the digital elevation model (DEM) contributed markedly to SOC content and were the main factors shaping its spatial distribution. Among random forest, decision tree, gradient boosting regression, CatBoost, XGBoost, and LightGBM, the CatBoost model performed best on the test set. The convergence curves of SSA and WOA showed that SSA converged faster and updated parameters more effectively. The optimized SSA-CatBoost model performed best in predicting SOC content. The spatial distribution of SOC has an important influence on the regional ecosystem and carbon cycle, and the grasslands in the northwest of the Gannan region have greater potential for soil fertility and carbon storage. These findings can help formulate more effective soil management and ecological protection strategies, slow the process of climate warming, and further promote the sustainable development of the global ecosystem.
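
The screening and evaluation steps named in the abstract (Pearson correlation screening, an 8:2 train/test split, and the MAE, RMSE, and R² metrics) can be illustrated with a minimal standard-library sketch. This is not the authors' code: the abstract does not state the significance criterion or the software used, so the correlation cutoff `min_abs_r`, the shuffling seed, and all function names below are assumptions chosen for illustration only.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant series carries no linear signal
    return cov / (sd_x * sd_y)


def screen_features(features, target, min_abs_r=0.3):
    """Keep feature names whose |r| with the target reaches min_abs_r.

    min_abs_r is a hypothetical cutoff; the abstract only says that
    "significant" factors were retained.
    """
    return [name for name, values in features.items()
            if abs(pearson_r(values, target)) >= min_abs_r]


def split_80_20(n_samples, seed=0):
    """Shuffle sample indices and split them 8:2 into train and test."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    cut = int(round(0.8 * n_samples))
    return indices[:cut], indices[cut:]


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination (assumes y_true is not constant)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

In a workflow like the one described, the screened feature names would select the model inputs, the training indices would feed the cross-validated model fitting, and the three metrics would be computed on the held-out test indices.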
|
|