Science Inventory

Prediction of pesticide acute toxicity using two-dimensional chemical descriptors and target species classification

Citation:

Martin, T., C. Lilavois, AND M. Barron. Prediction of pesticide acute toxicity using two-dimensional chemical descriptors and target species classification. SAR AND QSAR IN ENVIRONMENTAL RESEARCH. Taylor & Francis, Inc., Philadelphia, PA, 28(6):525-539, (2017). https://doi.org/10.1080/1062936X.2017.1343204

Impact/Purpose:

Previous modelling of the median lethal dose (oral rat LD50) has indicated that local class-based models yield better correlations than global models. This study used linear discriminant analysis (LDA) to assign each chemical an indicator such as the pesticide target species, mode of action, or target species - mode of action combination. The LDA models predicted these indicators with about 87% accuracy. Toxicity was then predicted using the QSAR model fit to the chemicals sharing that indicator. Toxicity was also predicted using a global hierarchical clustering (HC) approach, which divides the data set into clusters based on molecular similarity. At a comparable prediction coverage (~94%), the global HC method yielded slightly higher prediction accuracy (r2 = 0.50) than the LDA method (r2 ~ 0.47). A single model fit to the entire training set yielded the poorest results (r2 = 0.38), indicating that there is an advantage to clustering the dataset to predict acute toxicity. Finally, this study shows that while dividing the training set into subsets (i.e. clusters) improves prediction accuracy, it may not matter which method (expert-based or purely machine learning) is used to divide the dataset into subsets. This information is of interest to Regional and Program Office decision makers.

Description:

Previous modelling of the median lethal dose (oral rat LD50) has indicated that local class-based models yield better correlations than global models. We evaluated the hypothesis that dividing the dataset by pesticidal mechanisms would improve prediction accuracy. An approach based on linear discriminant analysis (LDA) was used to assign indicators such as the pesticide target species, mode of action, or target species - mode of action combination. LDA models were able to predict these indicators with about 87% accuracy. Toxicity was then predicted using the QSAR model fit to the chemicals sharing that indicator. Toxicity was also predicted using a global hierarchical clustering (HC) approach, which divides the data set into clusters based on molecular similarity. At a comparable prediction coverage (~94%), the global HC method yielded slightly higher prediction accuracy (r2 = 0.50) than the LDA method (r2 ~ 0.47). A single model fit to the entire training set yielded the poorest results (r2 = 0.38), indicating that there is an advantage to clustering the dataset to predict acute toxicity. Finally, this study shows that whilst dividing the training set into subsets (i.e. clusters) improves prediction accuracy, it may not matter which method (expert-based or purely machine learning) is used to divide the dataset into subsets.
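
The comparison between a single global model and class-based local models can be illustrated with a minimal sketch. This is not the authors' implementation. It makes several assumptions: the toy data are invented, each chemical has a single hypothetical descriptor, class labels are taken as known rather than predicted by LDA, each model is a one-variable least-squares line, r2 is taken to be the squared Pearson correlation between observed and predicted values, and the models are fit and scored on the same data for brevity.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x for a single descriptor."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def r_squared(obs, pred):
    """Squared Pearson correlation between observed and predicted values."""
    mo, mp = mean(obs), mean(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_p = sum((p - mp) ** 2 for p in pred)
    return cov * cov / (var_o * var_p)


# Hypothetical toy data: (class label, descriptor value, toxicity value).
# The two classes follow opposite trends, so one line cannot fit both.
data = [
    ("A", 1.0, 2.1), ("A", 2.0, 3.9), ("A", 3.0, 6.2), ("A", 4.0, 7.8),
    ("B", 1.0, 5.0), ("B", 2.0, 4.1), ("B", 3.0, 2.9), ("B", 4.0, 2.0),
]


def global_predictions(rows):
    """Fit one model to every chemical and predict each of them."""
    a, b = fit_line([x for _, x, _ in rows], [y for _, _, y in rows])
    return [a + b * x for _, x, _ in rows]


def local_predictions(rows):
    """Fit one model per class and predict each chemical with its class model."""
    models = {}
    for label in {c for c, _, _ in rows}:
        subset = [(x, y) for c, x, y in rows if c == label]
        models[label] = fit_line([x for x, _ in subset], [y for _, y in subset])
    return [models[c][0] + models[c][1] * x for c, x, _ in rows]


observed = [y for _, _, y in data]
r2_global = r_squared(observed, global_predictions(data))
r2_local = r_squared(observed, local_predictions(data))
print(f"global r2 = {r2_global:.2f}, local r2 = {r2_local:.2f}")
```

In this toy case the per-class models track the data far more closely than the single global line, which mirrors the direction of the result reported above. The gap is exaggerated here because the invented classes were chosen to have opposite trends.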

URLs/Downloads:

https://doi.org/10.1080/1062936X.2017.1343204

http://www.tandfonline.com/doi/full/10.1080/1062936X.2017.1343204

Record Details:

Record Type: DOCUMENT (JOURNAL/PEER REVIEWED JOURNAL)
Product Published Date: 07/13/2017
Record Last Revised: 05/11/2018
OMB Category: Other
Record ID: 338812

Organization:

U.S. ENVIRONMENTAL PROTECTION AGENCY

OFFICE OF RESEARCH AND DEVELOPMENT

NATIONAL RISK MANAGEMENT RESEARCH LABORATORY

LAND AND MATERIALS MANAGEMENT DIVISION

EMERGING CHEMISTRY AND ENGINEERING BRANCH