TY - JOUR
T1 - Advancing Satellite Precipitation Retrievals With Data Driven Approaches
T2 - Is Black Box Model Explainable?
AU - Li, Zhi
AU - Wen, Yixin
AU - Schreier, Mathias
AU - Behrangi, Ali
AU - Hong, Yang
AU - Lambrigtsen, Bjorn
N1 - Funding Information:
We acknowledge the NASA Oklahoma EPSCoR Research Initiation Grant and the Multidisciplinary Online Training Program in Spring 2020 on Big Data, High Performance Computing and Atmospheric Sciences Information. The authors appreciate the efforts of the NOAA NSSL team in making the MRMS data accessible. The first author is sponsored by the University of Oklahoma Hydrology and Water Security (HWS) program (https://www.ouhydrologyonline.com/) and the Graduate College Hoving Fellowship.
Publisher Copyright:
© 2020. The Authors.
PY - 2021/2
Y1 - 2021/2
N2 - Satellite-based precipitation retrieval is an essential and long-standing scientific problem. As the volume of satellite observations grows, data-driven approaches such as machine learning (ML) and deep learning (DL) are favored for handling large data sets and can potentially improve the accuracy of precipitation estimates. In this study, we built an ML/DL-based model pipeline (LinkNet segmentation + tree ensemble). The approach is applied to the Advanced Microwave Sounding Unit (AMSU) aboard the National Oceanic and Atmospheric Administration (NOAA) 18 and 19 satellites and compared with MultiRadar MultiSensor (MRMS) data. Four simulations were configured to examine the performance gain from incorporating three components: (1) precipitation identification, (2) nonlocal features, and (3) precipitation classification. More importantly, we examined the interpretability of the “black box” model to better understand the underlying physical connections. First, the results from this model pipeline suggest the advantages of the ML model, which reduces the systematic error and instantaneous error by a factor of two. Second, identifying precipitation pixels helps reduce the systematic error by 130%, and predicting precipitation classification improves correlations by 32%. Last, channels at higher frequencies (beyond 150 GHz) are favored for identifying precipitation regions, and channels at 89 and 150 GHz are ranked as the two most important features for precipitation retrieval. This study explores the potential of AMSU precipitation estimates with ML algorithms and provides a means of interpreting the models to facilitate better prediction of precipitation.
AB - Satellite-based precipitation retrieval is an essential and long-standing scientific problem. As the volume of satellite observations grows, data-driven approaches such as machine learning (ML) and deep learning (DL) are favored for handling large data sets and can potentially improve the accuracy of precipitation estimates. In this study, we built an ML/DL-based model pipeline (LinkNet segmentation + tree ensemble). The approach is applied to the Advanced Microwave Sounding Unit (AMSU) aboard the National Oceanic and Atmospheric Administration (NOAA) 18 and 19 satellites and compared with MultiRadar MultiSensor (MRMS) data. Four simulations were configured to examine the performance gain from incorporating three components: (1) precipitation identification, (2) nonlocal features, and (3) precipitation classification. More importantly, we examined the interpretability of the “black box” model to better understand the underlying physical connections. First, the results from this model pipeline suggest the advantages of the ML model, which reduces the systematic error and instantaneous error by a factor of two. Second, identifying precipitation pixels helps reduce the systematic error by 130%, and predicting precipitation classification improves correlations by 32%. Last, channels at higher frequencies (beyond 150 GHz) are favored for identifying precipitation regions, and channels at 89 and 150 GHz are ranked as the two most important features for precipitation retrieval. This study explores the potential of AMSU precipitation estimates with ML algorithms and provides a means of interpreting the models to facilitate better prediction of precipitation.
KW - deep learning
KW - interpretable machine learning
KW - precipitation estimation
UR - http://www.scopus.com/inward/record.url?scp=85100132566&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85100132566&partnerID=8YFLogxK
U2 - 10.1029/2020EA001423
DO - 10.1029/2020EA001423
M3 - Article
AN - SCOPUS:85100132566
VL - 8
JO - Earth and Space Science
JF - Earth and Space Science
SN - 2333-5084
IS - 2
M1 - e2020EA001423
ER -