Science Inventory

Ensemble QSAR Modeling to Predict Multispecies Fish Toxicity Lethal Concentrations and Points of Departure

Citation:

Sheffield, T. AND R. Judson. Ensemble QSAR Modeling to Predict Multispecies Fish Toxicity Lethal Concentrations and Points of Departure. ENVIRONMENTAL SCIENCE & TECHNOLOGY. American Chemical Society, Washington, DC, 53(21):12793-12802, (2019). https://doi.org/10.1021/acs.est.9b03957

Impact/Purpose:

This paper presents the development of a set of QSAR models to predict points of departure (PODs) for fish toxicity. EPA currently uses two models (TEST and ECOSAR), but both are dated and do not cover the full range of chemicals and target species used in regulatory testing. The new model has been shown to give more accurate predictions of fish PODs than either TEST or ECOSAR.

Description:

QSAR modeling can be used to aid testing prioritization of the thousands of chemical substances for which no ecological toxicity data is available. We drew on the U.S. Environmental Protection Agency’s ECOTOX database with additional data from ECHA to build a large data set containing in vivo test data on fish for thousands of chemical substances. This was used to create QSAR models to predict two types of points of departure (POD): acute LC50 (median lethal concentration) and endpoints comparable to the NOEC (no observed effect concentration) for any duration (named the “LC50” and “NOEC” models, respectively). These models used study covariates, such as species and exposure route, as features to facilitate the simultaneous use of varied data types. A novel method of substituting taxonomy groups for species dummy variables was introduced to maximize generalizability to different species. A stacked ensemble of three machine learning methods—random forest, gradient boosted trees, and support vector regression—was implemented to best make use of a large data set with many descriptors. The LC50 and NOEC models predicted PODs within one order of magnitude 81% and 76% of the time, respectively, and had RMSEs of roughly 0.83 and 0.98 log10(mg/L), respectively. Benchmarks against the existing TEST and ECOSAR tools suggest improved prediction accuracy.
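The two accuracy figures quoted above (the share of predictions within one order of magnitude and the RMSE in log10(mg/L)) can be illustrated with a minimal sketch. It assumes that "within one order of magnitude" means an absolute log10 error of at most 1; the function names and the example concentrations are hypothetical and are not taken from the paper.

```python
import math


def log10_errors(predicted_mg_per_l, observed_mg_per_l):
    """Per-chemical error in log10 units: log10(predicted) - log10(observed)."""
    return [
        math.log10(p) - math.log10(o)
        for p, o in zip(predicted_mg_per_l, observed_mg_per_l)
    ]


def rmse_log10(predicted_mg_per_l, observed_mg_per_l):
    """Root-mean-square error on the log10(mg/L) scale."""
    errors = log10_errors(predicted_mg_per_l, observed_mg_per_l)
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def fraction_within_one_order(predicted_mg_per_l, observed_mg_per_l):
    """Share of predictions whose absolute log10 error is at most 1.

    Assumption: this threshold is our reading of "within one order of
    magnitude"; the record does not spell out the exact definition.
    """
    errors = log10_errors(predicted_mg_per_l, observed_mg_per_l)
    return sum(1 for e in errors if abs(e) <= 1.0) / len(errors)


# Made-up concentrations in mg/L, for illustration only.
observed = [1.0, 10.0, 0.5, 200.0]
predicted = [2.0, 3.0, 0.4, 5000.0]

print(fraction_within_one_order(predicted, observed))
print(rmse_log10(predicted, observed))
```

In this made-up example only the last prediction misses by more than a factor of ten (5000 versus 200 mg/L), so three of the four predictions count as within one order of magnitude.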

Record Details:

Record Type: DOCUMENT (JOURNAL/PEER REVIEWED JOURNAL)
Product Published Date: 11/05/2019
Record Last Revised: 11/22/2019
OMB Category: Other
Record ID: 347555