TY - JOUR
T1 - Time series forecasting of Valley fever infection in Maricopa County, AZ using LSTM
AU - Jin, Xueting
AU - Wei, Fangwu
AU - Kandala, Srinivasa Srivatsav
AU - Umesh, Tejas
AU - Steele, Kayleigh
AU - Galgiani, John N.
AU - Laubichler, Manfred D.
N1 - Publisher Copyright: © 2025 The Author(s)
PY - 2025/3
Y1 - 2025/3
AB - Background: Coccidioidomycosis (CM), also known as Valley fever, is a respiratory infection. Recently, the number of confirmed cases of CM has been increasing. Precisely defining the influential factors and forecasting future infection can assist in public health messaging and treatment decisions. Methods: We utilized Long Short-Term Memory (LSTM) networks to forecast CM cases, based on the daily pneumonia cases in Maricopa County, Arizona from 2020 to 2022. Besides weather and climate variables, we examined the impact of people's lifestyle change during COVID-19. Factors, including temperature, precipitation, wind speed, PM10 and PM2.5 concentration, drought, and stringency index, were included in LSTM networks, considering their association with CM prevalence, time-lag effect, and correlation with other factors. Findings: LSTM can predict CM prevalence with accurate trend and low mean squared error (MSE). We also found a tradeoff between the length of the forecasting period and the performance of the forecasting model. The models with longer forecasting periods have less accurate trends over time and higher MSEs. Two models with different lengths of forecasting periods, 10 days and 30 days, are identified with good prediction. Interpretation: LSTM algorithms, combined with traditional statistical methods, could help with the forecasting of CM cases. By predicting the CM prevalence, our results can inform researchers, epidemiologists, clinicians, and the public in order to assist public health. Funding: “Getting to the Source of Arizona's Valley Fever Problem: A Tri-University Collaboration to Map and Characterize the Pathogen Where It Grows” funded by the Arizona Board of Regents.
KW - Coccidioidomycosis
KW - Deep learning
KW - LSTM
KW - Time series forecasting
KW - Valley fever
UR - https://www.scopus.com/pages/publications/85216899112
U2 - 10.1016/j.lana.2025.101010
DO - 10.1016/j.lana.2025.101010
M3 - Article
AN - SCOPUS:85216899112
SN - 2667-193X
VL - 43
JO - The Lancet Regional Health - Americas
JF - The Lancet Regional Health - Americas
M1 - 101010
ER -