Abstract:
Current machine learning-based safety monitoring models for hydraulic structures produce results that lack interpretability. To address this issue, we propose a method for predicting and interpreting dependent variables based on ensemble learning algorithms. We briefly describe an improved statistical model and two commonly used ensemble learning algorithms, random forest (RF) and extreme gradient boosting (XGBoost). We then introduce the SHapley Additive exPlanations (SHAP) method to make the results of the ensemble learning models interpretable, and we explain its principles and derivation. To verify the effectiveness and practicality of the approach, we apply it to deformation data from a super-high arch dam during its initial operation period. The results show that the XGBoost model achieves the highest prediction accuracy, with a coefficient of determination (R²) greater than 0.982 on the prediction set, followed by the improved statistical model; the RF model is the least accurate of the three. The SHAP method effectively separates the influence of each independent variable on the dependent variable, revealing the impact mechanism and enhancing the interpretability of the fitting and prediction results. The proposed method combines the strengths of “mechanism-driven” and “data-driven” approaches and offers valuable insights for the operation and management of major hydraulic structures.
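
As an illustration of the additive attribution idea behind SHAP, the following minimal sketch computes exact Shapley values for a single sample by enumerating all feature coalitions. It is not the implementation used in this work: the toy model, its three inputs (labelled here as water level, temperature, and time effect), its coefficients, and the zero baseline are all hypothetical, and features absent from a coalition are simply replaced by their baseline values.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample x relative to a baseline sample.

    Assumption for illustration: a feature that is absent from a coalition
    takes its baseline value.
    """
    n = len(x)
    phi = [0.0] * n

    def value(coalition):
        # Model output when only the features in `coalition` take their real values.
        z = [x[j] if j in coalition else baseline[j] for j in range(n)]
        return predict(z)

    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight |S|! (n - |S| - 1)! / n! for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(z):
    # Hypothetical deformation model with one interaction term (not from the paper).
    h, t, theta = z
    return 2.0 * h + 0.5 * t + 0.1 * theta + 0.3 * h * t


x = [1.0, 2.0, 3.0]          # hypothetical sample
baseline = [0.0, 0.0, 0.0]   # hypothetical reference point
phi = shapley_values(toy_model, x, baseline)
```

In this toy case the contributions sum to the difference between the prediction for the sample and the prediction for the baseline, and the interaction term is split equally between the two features involved. Exact enumeration scales exponentially with the number of features, so it is practical only for very small inputs.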