Waterway Freight Volume Prediction Based on SHAP and the TSO-XGBoost Model

Abstract: Waterway freight demand is influenced by many factors. After the "645" project on the middle reaches of the Yangtze River main line was implemented, navigation conditions in the channel improved markedly. To better analyze the trend in freight volume after the project, a new waterway freight volume prediction model is proposed. First, quadratic interpolation and KNN inverse-distance-weighted interpolation are used to resolve inconsistent time granularity and missing values in the high-dimensional panel data. Second, hierarchical clustering and the interpretability of SHAP values are combined to screen the feature sequences of key influencing factors, reducing the dimension and scale of the model's input data. Third, the Halton low-discrepancy sequence and a quasi-reflection-based learning (QRBL) strategy are introduced to substantially improve the search performance of the tuna swarm optimization (TSO) algorithm, which strengthens its search over the hyperparameter combinations that determine the fitting ability of the extreme gradient boosting (XGBoost) model, such as the number of decision trees, the tree depth, and the learning rate. The results show that the new model's prediction accuracy is significantly better than that of the comparison models, making it better suited to waterway freight volume prediction under multiple influencing features.
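The abstract names two ingredients for improving the TSO search, a Halton low-discrepancy sequence and quasi-reflection-based learning, without giving details. The sketch below is one plausible reading, not the authors' implementation: it seeds a population over three XGBoost hyperparameters with Halton points, generates a quasi-reflected candidate for each (a uniform draw between the centre of the search range and the point), and keeps the best individuals. The bounds, the function names, and `toy_loss` (a stand-in for a cross-validated XGBoost error) are all assumptions made for illustration.

```python
import random

# Hypothetical search bounds: number of trees, tree depth, learning rate.
LOWER = [50.0, 2.0, 0.01]
UPPER = [500.0, 10.0, 0.30]
PRIMES = [2, 3, 5]  # one coprime base per dimension


def halton(index, base):
    """Return the index-th (1-based) van der Corput value in the given base."""
    result, fraction = 0.0, 1.0
    while index > 0:
        fraction /= base
        result += fraction * (index % base)
        index //= base
    return result


def halton_population(n, lower, upper):
    """Scale n Halton points from the unit cube into [lower, upper]."""
    return [
        [lo + halton(i, PRIMES[d]) * (hi - lo)
         for d, (lo, hi) in enumerate(zip(lower, upper))]
        for i in range(1, n + 1)
    ]


def quasi_reflect(point, lower, upper, rng):
    """Quasi-reflected point: uniform draw between the range centre and x."""
    reflected = []
    for x, lo, hi in zip(point, lower, upper):
        centre = (lo + hi) / 2.0
        reflected.append(rng.uniform(min(centre, x), max(centre, x)))
    return reflected


def init_population(n, lower, upper, fitness, seed=0):
    """Merge Halton points with their quasi-reflections and keep the best n."""
    rng = random.Random(seed)
    base = halton_population(n, lower, upper)
    mirrored = [quasi_reflect(p, lower, upper, rng) for p in base]
    return sorted(base + mirrored, key=fitness)[:n]


def toy_loss(p):
    """Placeholder objective; a real run would score an XGBoost model here."""
    target = [300.0, 6.0, 0.1]  # arbitrary illustrative optimum
    return sum(((x - t) / (hi - lo)) ** 2
               for x, t, lo, hi in zip(p, target, LOWER, UPPER))


population = init_population(8, LOWER, UPPER, toy_loss)
```

In an actual pipeline the tree count and depth would be rounded to integers before training, and the resulting `population` would be handed to the TSO update loop as its starting swarm.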

     
