TY - JOUR

T1 - Bayesian inference and predictive performance of soil respiration models in the presence of model discrepancy

AU - Elshall, Ahmed S.

AU - Ye, Ming

AU - Niu, Guo Yue

AU - Barron-Gafford, Greg A.

N1 - Funding Information:
Acknowledgements. The first two authors were supported by the U.S. Department of Energy grant no. DE-SC0008272. The first author was also partly supported by the U.S. National Science Foundation award no. OIA-1557349. The second author was also partly supported by U.S. Department of Energy grant no. DE-SC0019438 and U.S. National Science Foundation grant no. EAR-1552329. We thank the two anonymous reviewers for providing comments that helped to improve the paper.
Publisher Copyright:
© 2019 Author(s).

PY - 2019/5/23

Y1 - 2019/5/23

N2 - Bayesian inference of microbial soil respiration models is often based on the assumptions that the residuals are independent (i.e., no temporal or spatial correlation), identically distributed (i.e., Gaussian noise), and have constant variance (i.e., homoscedastic). In the presence of model discrepancy, as no model is perfect, this study shows that these assumptions are generally invalid in soil respiration modeling such that residuals have high temporal correlation, an increasing variance with increasing magnitude of CO2 efflux, and non-Gaussian distribution. Relaxing these three assumptions stepwise results in eight data models. Data models are the basis of formulating likelihood functions of Bayesian inference. This study presents a systematic and comprehensive investigation of the impacts of data model selection on Bayesian inference and predictive performance. We use three mechanistic soil respiration models with different levels of model fidelity (i.e., model discrepancy) with respect to the number of carbon pools and the explicit representations of soil moisture controls on carbon degradation; therefore, we have different levels of model complexity with respect to the number of model parameters. The study shows that data models have substantial impacts on Bayesian inference and predictive performance of the soil respiration models such that the following points are true: (i) the level of complexity of the best model is generally justified by the cross-validation results for different data models; (ii) not accounting for heteroscedasticity and autocorrelation might not necessarily result in biased parameter estimates or predictions, but will definitely underestimate uncertainty; (iii) using a non-Gaussian data model improves the parameter estimates and the predictive performance; and (iv) accounting for autocorrelation only or joint inversion of correlation and heteroscedasticity can be problematic and requires special treatment. Although the conclusions of this study are empirical, the analysis may provide insights for selecting appropriate data models for soil respiration modeling.

UR - http://www.scopus.com/inward/record.url?scp=85066094922&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85066094922&partnerID=8YFLogxK

U2 - 10.5194/gmd-12-2009-2019

DO - 10.5194/gmd-12-2009-2019

M3 - Article

AN - SCOPUS:85066094922

SN - 1991-959X

VL - 12

SP - 2009

EP - 2032

JO - Geoscientific Model Development

JF - Geoscientific Model Development

IS - 5

ER -