Science Inventory

Bacterial and Viral Fecal Indicator Predictive Modeling Using Gradient Boosting at Three Great Lakes Recreational Beach Sites

Citation:

Cyterski, M., O. Shanks, P. Wanjugi, B. McMinn, A. Korajkic, K. Oshima, AND R. Haugland. Bacterial and Viral Fecal Indicator Predictive Modeling Using Gradient Boosting at Three Great Lakes Recreational Beach Sites. WATER RESEARCH. Elsevier Science Ltd, New York, NY, 223:118970, (2022). https://doi.org/10.1016/j.watres.2022.118970

Impact/Purpose:

This manuscript details how predictive statistical models were developed for coliphage and molecular markers at three Great Lakes beach sites from data gathered in the summer of 2015. These water quality measures are thought to be more reliable indicators of health impacts on recreational water users than traditional cultured fecal indicators, such as E. coli and enterococci bacteria. However, there is uncertainty about whether predictive models can be successfully fit to the newer water quality measures. We found that machine learning statistical methods were able to fit the coliphage and molecular marker data nearly as well as traditional fecal indicator bacteria, paving the way for state, regional, and local beach managers to use predictive models of these important water quality measures to limit public exposure to poor water quality.

Description:

Coliphage are viruses that infect Escherichia coli (E. coli) and may indicate the presence of enteric viral pathogens in recreational waters. There is increasing interest in using these viruses for water quality monitoring and forecasting; however, the ability to use statistical models to predict the concentrations of coliphage, as is often done for cultured fecal indicator bacteria (FIB) such as enterococci and E. coli, has not been widely assessed. The same can be said for FIB genetic markers measured using quantitative polymerase chain reaction (qPCR) methods. Here we institute least-angle regression (LARS) modeling of previously published concentrations of cultured FIB (E. coli, enterococci) and coliphage (F+, somatic), along with newly reported genetic concentrations measured via qPCR for E. coli, enterococci, and general Bacteroidales. We develop site-specific models from measures taken at three beach sites on the Great Lakes (Grant Park, South Milwaukee, WI; Edgewater Beach, Cleveland, OH; Washington Park, Michigan City, IN) to investigate the efficacy of a statistical predictive modeling approach. Microbial indicator concentrations were measured in composite water samples collected five days per week over a beach season (∼15 weeks). Model predictive performance metrics (cross-validated standardized root mean squared error of prediction [SRMSEP] and R²PRED) were examined for seven microbial indicators (using log10 concentrations) and water/beach parameters collected concurrently with water samples. The highest predictive performance was seen for qPCR-based enterococci and Bacteroidales models, with F+ coliphage consistently yielding poorly performing models. Influential covariates varied by microbial indicator and site. Antecedent rainfall, bird abundance, wave height, and wind speed/direction were most influential across all models. Findings suggest that some fecal indicators may be more suitable for water quality forecasting than others at Great Lakes beaches.
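
As an illustration of the two cross-validated performance measures named above, the following minimal sketch computes an SRMSEP and a predictive R² by leave-one-out cross-validation. It is not the authors' procedure: the record does not give the formulas, so the sketch assumes SRMSEP is the root mean squared error of prediction divided by the sample standard deviation of the observations, and that predictive R² is 1 - PRESS/SST. It also swaps in a one-covariate ordinary least squares fit in place of LARS, and the data are synthetic (75 samples, matching five days per week over roughly 15 weeks). The names `fit_ols`, `loo_cv_metrics`, `rain`, and `log10_conc` are hypothetical.

```python
import math
import random


def fit_ols(xs, ys):
    """Fit y = a + b*x by ordinary least squares and return (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def loo_cv_metrics(xs, ys):
    """Leave-one-out cross-validation; return (srmsep, r2_pred).

    Assumed definitions (not taken from the source record):
      PRESS   = sum of squared held-out prediction errors
      SRMSEP  = sqrt(PRESS / n) / sample standard deviation of ys
      R2_pred = 1 - PRESS / SST
    """
    n = len(ys)
    press = 0.0
    for i in range(n):
        # Refit the model without observation i, then predict observation i.
        a, b = fit_ols(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (a + b * xs[i])) ** 2
    mean_y = sum(ys) / n
    sst = sum((y - mean_y) ** 2 for y in ys)
    rmsep = math.sqrt(press / n)
    sd_y = math.sqrt(sst / (n - 1))
    return rmsep / sd_y, 1.0 - press / sst


# Synthetic data only: a made-up rainfall covariate and a made-up
# log10 indicator concentration that depends on it plus noise.
rng = random.Random(0)
rain = [rng.uniform(0.0, 30.0) for _ in range(75)]
log10_conc = [1.0 + 0.05 * r + rng.gauss(0.0, 0.3) for r in rain]

srmsep, r2_pred = loo_cv_metrics(rain, log10_conc)
print(f"SRMSEP = {srmsep:.3f}, R2_pred = {r2_pred:.3f}")
```

Under these assumed definitions, a lower SRMSEP and a higher predictive R² both indicate better out-of-sample prediction, and standardizing by the observed spread lets indicators with different concentration ranges be compared on one scale.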

Record Details:

Record Type: DOCUMENT (JOURNAL/PEER REVIEWED JOURNAL)
Product Published Date: 09/01/2022
Record Last Revised: 08/28/2023
OMB Category: Other
Record ID: 355857