TY - JOUR
T1 - Cross-Validation Indicates Predictive Models May Provide an Alternative to Indicator Organism Monitoring for Evaluating Pathogen Presence in Southwestern US Agricultural Water
AU - Belias, Alexandra
AU - Brassill, Natalie
AU - Roof, Sherry
AU - Rock, Channah
AU - Wiedmann, Martin
AU - Weller, Daniel
N1 - Funding Information:
We would like to thank Maureen Gunderson, Kayla Ferris, Nathanael Henderson, Julia Muuse, and Sriya Sunil for their help on various aspects of this project. We would also like to acknowledge the various irrigation districts within the Southwest growing region for allowing our team to collect agricultural water samples and providing guidance on sampling locations appropriate for the study. Funding. This study was funded by the Center for Produce Safety (2017CPS09). Data analysis, manuscript preparation, and revision were also partially supported by the National Institute of Environmental Health Sciences, National Institutes of Health (NIH) under award number T32ES007271. This work is supported by the Specialty Crop Research Initiative program, Award Number: 2019-51181-30016 from the USDA National Institute of Food and Agriculture. Any opinions, findings, conclusions, or recommendations expressed in this publication are those of the author(s) and do not necessarily reflect the view of the U.S. Department of Agriculture.
Publisher Copyright:
Copyright © 2021 Belias, Brassill, Roof, Rock, Wiedmann and Weller.
PY - 2021/8/5
Y1 - 2021/8/5
N2 - Pathogen contamination of agricultural water has been identified as a probable cause of recalls and outbreaks. However, variability in pathogen presence and concentration complicates the reliable identification of agricultural water at elevated risk of pathogen presence. In this study, we collected data on the presence of Salmonella and genetic markers for enterohemorrhagic E. coli (EHEC; PCR-based detection of stx and eaeA) in southwestern US canal water, which is used as agricultural water for produce. We developed and assessed the accuracy of models to predict the likelihood of pathogen contamination of southwestern US canal water. Based on 169 samples from 60 surface water canals (each sampled 1–3 times), 36% (60/169) and 21% (36/169) of samples were positive for Salmonella presence and EHEC markers, respectively. Water quality parameters (e.g., generic E. coli level, turbidity), surrounding land-use (e.g., natural cover, cropland cover), weather conditions (e.g., temperature), and sampling site characteristics (e.g., canal type) data were collected as predictor variables. Separate conditional forest models were trained for Salmonella isolation and EHEC marker detection, and cross-validated to assess predictive performance. For Salmonella, turbidity, day of year, generic E. coli level, and % natural cover in a 500–1,000 ft (~150–300 m) buffer around the sampling site were the top 4 predictors identified by the conditional forest model. For EHEC markers, generic E. coli level, day of year, % natural cover in a 250–500 ft (~75–150 m) buffer, and % natural cover in a 500–1,000 ft (~150–300 m) buffer were the top 4 predictors. Predictive performance measures (e.g., area under the curve [AUC]) indicated predictive modeling shows potential as an alternative method for assessing the likelihood of pathogen presence in agricultural water. Secondary conditional forest models with generic E. coli level excluded as a predictor showed < 0.01 difference in AUC as compared to the AUC values for the original models (i.e., with generic E. coli level included as a predictor) for both Salmonella (AUC = 0.84) and EHEC markers (AUC = 0.92). Our data suggest that models that do not require the inclusion of microbiological data (e.g., indicator organism) show promise for real-time prediction of pathogen contamination of agricultural water (e.g., in surface water canals).
KW - Arizona
KW - E. coli
KW - Salmonella
KW - agricultural water
KW - predictive modeling
KW - produce safety
UR - http://www.scopus.com/inward/record.url?scp=85119699172&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85119699172&partnerID=8YFLogxK
U2 - 10.3389/frwa.2021.693631
DO - 10.3389/frwa.2021.693631
M3 - Article
AN - SCOPUS:85119699172
SN - 2624-9375
VL - 3
JO - Frontiers in Water
JF - Frontiers in Water
M1 - 693631
ER -