Grantee Research Project Results
2002 Progress Report: Using Neural Networks to Create New Indices and Classification Schemes
EPA Grant Number: R829784
Title: Using Neural Networks to Create New Indices and Classification Schemes
Investigators: Brion, Gail M.
Current Investigators: Brion, Gail M., Lingireddy, Srinivasa
Institution: University of Kentucky
EPA Project Officer: Page, Angela
Project Period: July 1, 2002 through June 30, 2005 (Extended to June 30, 2006)
Project Period Covered by this Report: July 1, 2002 through June 30, 2003
Project Amount: $523,938
RFA: Microbial Risk in Drinking Water (2001)
Research Category: Water, Drinking Water, Human Health
Objective:
The objective of this research project is to use advanced computer modeling techniques to predict enhanced risk, as represented by peak numbers of potential pathogens, in a drinking water source from other more easily measured surrogate parameters.
Progress Summary:
Artificial neural networks (ANNs) have been applied to a database of indicators and enteric viruses recovered from shellfish and their surrounding waters in four European countries. The ANNs outperformed logistic regression in classifying viral presence and type, and they indicated site-specific differences in the relationships between indicators and potential pathogens. In a case study, the proposed new bacterial ratio (atypical colonies/total coliform colonies [AC/TC]) was superior to fecal coliform and Enterococci levels for indicating inputs of fresh human fecal contamination. Long-term sampling of the Kentucky River for indicators and pathogens has commenced.
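As a minimal sketch of the ratio named above, the AC/TC value for one sample can be computed directly from its two colony counts. The function name, the argument names, and the example counts are hypothetical illustrations; they are not data or code from the project.

```python
def ac_tc_ratio(atypical_colonies: int, total_coliform_colonies: int) -> float:
    """Return atypical colonies divided by total coliform colonies for one sample."""
    if atypical_colonies < 0 or total_coliform_colonies < 0:
        raise ValueError("colony counts must be non-negative")
    if total_coliform_colonies == 0:
        # The ratio is undefined when no total coliform colonies were counted.
        raise ValueError("AC/TC is undefined for zero total coliform colonies")
    return atypical_colonies / total_coliform_colonies


# Hypothetical counts (atypical, total coliform), for illustration only.
samples = {"sample_a": (420, 30), "sample_b": (150, 60)}
ratios = {name: ac_tc_ratio(ac, tc) for name, (ac, tc) in samples.items()}
print(ratios)
```

Raising an error for a zero denominator, rather than returning infinity, is a design choice of this sketch; how the project handles such samples is not stated in the report.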
The project schedule has changed slightly: the objectives for Year 1 and Year 3 have been reordered. Modeling is ahead of schedule because of unexpected contributions of outside data, and it is advanced enough to support the original hypothesis presented in the project. River sampling, however, started late for two reasons. First, construction delays and fungal contamination kept the tissue culture laboratory from becoming operational until March 2003. Second, the company that had pledged access to the Kentucky River for sampling once per week for 2.5 years was unable to meet its commitment because of new security regulations that arose as a result of September 11, 2001, and the threat of other terrorist activities. Full spectrum analysis of the Kentucky River began in June 2003, and the summer sampling schedule has been doubled to recover from the slow start.
Although sampling started late, the project is well underway. Students have been recruited and trained for their assays. The tissue culture laboratory has been completed and is now in full production. During its startup phase, the laboratory established solid connections with Dr. Dan Dahling of the U.S. Environmental Protection Agency (EPA) laboratories in Cincinnati, OH, which has provided another conduit for technical support. Poliovirus stocks have been grown and titered, and are now in use for initial precision and recovery (IPR) studies. IPR studies have been initiated for enteric viruses, phage, and protozoa. The protozoan and phage IPRs are complete, and the enteric virus study is nearing completion.
The ANN program originally written by Dr. Lingireddy has been modified by Dr. Neelankantan. It is now called "Neurosort," and is easier to use in a Microsoft Windows-based format. A seminar on its use has been provided to graduate students and faculty at the University of Kentucky.
Analysis of the river data by ANNs is not yet possible because too few data have been collected. However, there are some interesting trends in the summer data. So far, the presence of male-specific coliphages in 400 mL of Kentucky River water, analyzed without enrichment, has not been related to the presence of enteric viruses concentrated from 100 L of river water by conventional filtration-elution methods. Giardia cysts are always present, but Cryptosporidium oocysts are present only when the river turbidity is more than 200. Enterococci and fecal coliforms are highly correlated (P < 0.001). The AC/TC ratio (the new indicator under development by this project) is not linearly related to the nontransformed values of bacterial indicators, but it is showing the expected decreases during times of rain, which indicate the presence of fresher fecal material. This supports the findings obtained from three other local watersheds. Coprostanol and epicoprostanol are present in detectable amounts in all samples, but because of the long extraction procedure, these data lag behind the other, more immediately measured analytes. All positive enteric virus flasks are being frozen and archived for future identification by genetic methods.
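The kind of relationship check described above can be sketched with a plain Pearson correlation. The report does not say which correlation statistic or transformation was used, so the use of Pearson's r on log10(count + 1) values is an assumption of this sketch, and the counts below are hypothetical, not project data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sxy = sum(a * b for a, b in zip(dx, dy))
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def log10_counts(counts):
    """Log-transform counts; adding 1 (an assumption) keeps zero counts defined."""
    return [math.log10(c + 1) for c in counts]


# Hypothetical indicator counts per sample, for illustration only.
enterococci = [10, 100, 1000, 10000]
fecal_coliforms = [20, 200, 2000, 20000]
r = pearson_r(log10_counts(enterococci), log10_counts(fecal_coliforms))
print(round(r, 4))
```

This sketch returns only the coefficient; a significance level such as the reported P value would need an additional test that is not shown here.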
Extensive progress has been made on applying ANNs to in-house and externally provided databases. Logistic regression has been found to be inferior to ANN analysis for predicting the presence and type of enteric virus in shellfish from Europe. Findings from ANN probing of the data suggest that relationships between indicators and enteric virus presence can be very site specific. Several improvements have been made to the existing neural network program. A visual interface, along with a user-friendly data preprocessor and a statistical postprocessor, has been added to the existing code. The enhanced user interface allows the researchers to spend their time and effort interpreting the results from the ANNs, rather than preparing data for neural network analysis. One additional training scheme and several different transformation functions have been added to the existing neural network code. A graphical postprocessing tool has been developed for visual interpretation of the trained neural network weights. This provides insight into the relative importance of the input variables in a well-trained neural network model.
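The report does not describe how the postprocessing tool derives variable importance from the weights. One widely used approach, shown here only as an illustrative assumption, is Garson-style weight partitioning for a network with one hidden layer and a single output. The function name, the weight layout, and the example weights are all hypothetical.

```python
def garson_importance(input_hidden, hidden_output):
    """Relative importance of each input from the absolute connection weights.

    input_hidden[i][j] is the weight from input i to hidden node j.
    hidden_output[j] is the weight from hidden node j to the single output.
    Returns one fraction per input; the fractions sum to 1.
    """
    n_inputs = len(input_hidden)
    n_hidden = len(hidden_output)
    if n_inputs == 0 or any(len(row) != n_hidden for row in input_hidden):
        raise ValueError("weight shapes are inconsistent")
    scores = [0.0] * n_inputs
    for j in range(n_hidden):
        column_total = sum(abs(input_hidden[i][j]) for i in range(n_inputs))
        if column_total == 0:
            continue  # this hidden node receives no signal from any input
        for i in range(n_inputs):
            # Share of hidden node j attributed to input i, scaled by how
            # strongly node j drives the output.
            share = abs(input_hidden[i][j]) / column_total
            scores[i] += share * abs(hidden_output[j])
    total = sum(scores)
    if total == 0:
        raise ValueError("all weights are zero; importance is undefined")
    return [s / total for s in scores]


# Hypothetical weights: two inputs, two hidden nodes, one output.
importance = garson_importance([[1.0, 0.0], [1.0, 2.0]], [1.0, 1.0])
print(importance)
```

Because the method uses absolute weights, it ranks inputs by strength of influence but says nothing about the direction of each effect.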
Future Activities:
We will continue river sampling on a weekly basis, analyze the in-house databases we have accumulated, compare ANN analysis against logistic regression and other methods of data analysis on multiparameter databases, investigate the applicability of ANNs to bootstrapping and data expansion procedures, and modify the existing Neurosort program to add new functions and training modalities.
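The report does not specify which bootstrapping procedure is planned. As a minimal sketch, assuming ordinary resampling with replacement, a small set of sample records could be expanded into several training sets as follows. The function name, the record layout, and the values are hypothetical.

```python
import random


def bootstrap_resamples(records, n_resamples, seed=0):
    """Draw n_resamples data sets, each the size of the original,
    by sampling records with replacement."""
    if not records:
        raise ValueError("records must not be empty")
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    size = len(records)
    return [rng.choices(records, k=size) for _ in range(n_resamples)]


# Hypothetical records: (indicator value, virus detected), illustration only.
records = [(2.1, 0), (3.4, 1), (1.7, 0), (4.0, 1), (2.9, 0)]
resamples = bootstrap_resamples(records, n_resamples=3)
for resample in resamples:
    print(resample)
```

Each resample could then be used to train a separate model, so that the spread of the predictions gives a rough sense of how sensitive the results are to the limited data.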
Journal Articles:
No journal articles were submitted with this report; 21 publications are listed for this project.
Supplemental Keywords:
water quality, pathogens, indicators, modeling, artificial neural networks, risk management, remediation, Giardia, Cryptosporidium, enteric viruses, EPA Region 4, alternative disinfection methods, bacteria, drinking water contaminants, early warning, ecological risk, emerging pathogens, environmental monitoring, microbial contamination, microbial pathogens, microbial risk assessment, microbiological organisms, water contaminants, water contamination detection, RFA, Scientific Discipline, Water, Geographic Area, Environmental Chemistry, Ecological Risk Assessment, Ecology and Ecosystems, Drinking Water, Engineering, Chemistry, & Physics, Environmental Engineering, EPA Region, neural networks
Progress and Final Reports:
Original Abstract
The perspectives, information and conclusions conveyed in research project abstracts, progress reports, final reports, journal abstracts and journal publications convey the viewpoints of the principal investigator and may not represent the views and policies of ORD and EPA. Conclusions drawn by the principal investigators have not been reviewed by the Agency.