Forecasting and analysis of deformation in hydraulic structures

  • Abstract: Safety monitoring models for hydraulic structures based on machine learning algorithms suffer from poor interpretability of their results. To improve interpretability, this paper develops a method for constructing and interpreting deformation prediction models for hydraulic structures based on ensemble learning algorithms. An improved statistical model and two commonly used ensemble learning algorithms, random forest (RF) and extreme gradient boosting (XGBoost), are briefly described, and the Shapley additive explanations (SHAP) method is introduced to make the results of the ensemble learning models interpretable; its principles and derivation are explained. Deformation data from a super-high arch dam in its initial operation period are used to verify the effectiveness and practicality of the method. The results show that the XGBoost model achieves the highest prediction accuracy, with a coefficient of determination greater than 0.982 on the prediction set; the improved statistical model ranks second, and the RF model is relatively less accurate. The SHAP method separates the influence of each independent variable on the dependent variable and reveals both global and local influence mechanisms, making the fitted and predicted results interpretable. The proposed method combines the strengths of "mechanism-driven" and "data-driven" models and can support decision-making in the operation and management of hydraulic structures.
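The SHAP method attributes a model's output to its inputs using Shapley values from cooperative game theory: each feature's contribution is a weighted average of its marginal effect over all feature coalitions. The following is a minimal, self-contained sketch of that principle (not the paper's implementation, which uses tree-model SHAP on real dam data). The value function `v` is hypothetical, loosely mimicking an HST-style decomposition with a water-level term `H`, a temperature term `T`, an aging term `theta`, and an H-T interaction:

```python
from itertools import combinations
from math import factorial

def shapley_values(features, value):
    """Brute-force Shapley values:
    phi_i = sum over coalitions S (i not in S) of
            |S|! (n-|S|-1)! / n! * (v(S ∪ {i}) - v(S))."""
    n = len(features)
    phi = {}
    for i in features:
        others = [f for f in features if f != i]
        total = 0.0
        for k in range(n):
            for S in combinations(others, k):
                w = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                total += w * (value(set(S) | {i}) - value(set(S)))
        phi[i] = total
    return phi

# Hypothetical value function: predicted displacement (mm) given a coalition
# of active features. H contributes 3.0, T contributes 1.0, theta contributes
# 2.0, and H and T together add a 0.5 interaction effect.
def v(S):
    out = 0.0
    if "H" in S:
        out += 3.0
    if "T" in S:
        out += 1.0
    if "H" in S and "T" in S:
        out += 0.5
    if "theta" in S:
        out += 2.0
    return out

phi = shapley_values(["H", "T", "theta"], v)
# The interaction is split evenly between H and T, and the attributions
# sum to the full-coalition output (the efficiency property SHAP relies on).
print(phi)  # {'H': 3.25, 'T': 1.25, 'theta': 2.0}
```

This illustrates why SHAP can "separate the influence of each independent variable on the dependent variable", as the abstract states: the efficiency property guarantees the per-feature attributions add up exactly to the model's prediction.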

     
