TY - JOUR
T1 - A Robust Strategy to Account for Data Sampling Variability in the Development of Hydrological Models
AU - Zheng, Feifei
AU - Chen, Junyi
AU - Ma, Yiyi
AU - Chen, Qiuwen
AU - Maier, Holger R.
AU - Gupta, Hoshin
N1 - Publisher Copyright: © 2023. American Geophysical Union. All Rights Reserved.
PY - 2023/3
Y1 - 2023/3
AB - It is typical to use a single portion of the available data to calibrate hydrological models, and the remainder for model evaluation. To minimize model-bias, this partitioning must be performed so as to ensure distributional representativeness and mutual consistency. However, failure to account for data sampling variability (DSV) in the underlying Data Generating Process can weaken the model's generalization performance. While “K-fold cross-validation” can mitigate this problem, it is computationally inefficient since the calibration/evaluation operations must be repeated numerous times. This paper develops a general strategy for stochastic evolutionary parameter optimization (SEPO) that explicitly accounts for DSV when calibrating a model using any population-based evolutionary optimization algorithm (EOA), such as Shuffled Complex Evolution (SCE). Inspired in part by the machine-learning strategy of stochastic gradient descent (SGD), we use various representative random sub-samples to drive the EOA toward the distribution of the model parameters. Unlike in SGD, derivative information is not required and hence SEPO can be applied to any hydrological model where such information is not readily available. To demonstrate the effectiveness of the proposed strategy, we implement it within the well-known SCE, to calibrate the GR4J conceptual rainfall-runoff model to 163 hydro-climatically diverse catchments. Using only a single optimization run, our Stochastic SCE method converges to population-based estimates of model parameter distributions (and corresponding simulation uncertainties), without compromising model performance during either calibration or evaluation. Further, it effectively reduces the need to perform independent evaluation tests of model performance under conditions that are represented by the available data.
KW - data sampling variability
KW - hydrological model
KW - model calibration
KW - stochastic gradient descent
KW - uncertainty analysis
UR - http://www.scopus.com/inward/record.url?scp=85152517154&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85152517154&partnerID=8YFLogxK
U2 - 10.1029/2022WR033703
DO - 10.1029/2022WR033703
M3 - Article
AN - SCOPUS:85152517154
SN - 0043-1397
VL - 59
JO - Water Resources Research
JF - Water Resources Research
IS - 3
M1 - e2022WR033703
ER -