Science Inventory

OPERA: A free and open source QSAR tool for predicting physicochemical properties and environmental fate endpoints

Citation:

Mansouri, K., C. Grulke, R. Judson, AND A. Williams. OPERA: A free and open source QSAR tool for predicting physicochemical properties and environmental fate endpoints. Presented at American Chemical Society Spring 2018, New Orleans, LA, March 18 - 22, 2018.

Impact/Purpose:

This study developed robust QSAR models for physicochemical properties and environmental fate endpoints that can be used for regulatory purposes.

Description:

Public databases and open data make it easier to collect the chemical structures and experimental data needed for QSAR modeling. However, QSAR model performance depends on the quality of the data and on the modeling methodology used. This study developed robust QSAR models for physicochemical properties and environmental fate endpoints that can be used for regulatory purposes. Publicly available data were collected from the PHYSPROP database, among other sources. These data sets underwent extensive curation using an in-house automated workflow to enhance data quality. The chemical structures were standardized to “QSAR-ready” form prior to calculation of the molecular descriptors. The modeling procedure followed the five OECD principles for QSAR models, with the aim of producing reliable yet simple models. Genetic algorithms were used to select the most pertinent and mechanistically interpretable descriptors (from 2 to 15 per model, with an average of 11). The sizes of the modeled data sets ranged from 150 chemicals for biodegradability half-life to 14,050 chemicals for logP, with an average of 3,222 chemicals across all endpoints. The optimal models were built on randomly selected training sets (75% of the data) and validated using 5-fold cross-validation (CV) and held-out test sets (the remaining 25%). The CV Q2 of the models ranged from 0.72 to 0.95, with an average of 0.86, and the test-set R2 ranged from 0.71 to 0.96, with an average of 0.82. Modeling and performance details were documented in QSAR Model Reporting Format (QMRF) reports and validated by the European Commission’s Joint Research Centre (JRC) for OECD compliance. All models are delivered as a free, open source/open data application called OPERA (OPEn structure-activity Relationship App), which was used to predict properties for ~750,000 chemicals. The predicted data are freely available on the EPA’s CompTox Chemistry Dashboard (https://comptox.epa.gov). This work does not reflect U.S. EPA policy.
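
The validation scheme described above (a random 75%/25% training/test split, 5-fold cross-validation on the training set, then a CV Q2 and a test-set R2) can be illustrated with a minimal Python sketch. This is not OPERA's code. The abstract does not state the model type, so a one-descriptor least-squares line stands in as a placeholder model. All function names are hypothetical, the data are synthetic, and the convention of measuring both Q2 and R2 against the training-set mean is an assumption, since several definitions are in use.

```python
import random


def split_train_test(n_items, train_fraction=0.75, seed=0):
    """Randomly split the indices 0..n_items-1 into training and test lists."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    n_train = round(n_items * train_fraction)
    return indices[:n_train], indices[n_train:]


def k_folds(indices, k=5):
    """Partition a list of indices into k folds of nearly equal size."""
    return [indices[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x (placeholder model only)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def predict(model, xs):
    """Apply a fitted (intercept, slope) pair to new descriptor values."""
    intercept, slope = model
    return [intercept + slope * x for x in xs]


def determination(observed, predicted, reference_mean):
    """Return 1 - (sum of squared errors) / (sum of squares about reference_mean)."""
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    total = sum((o - reference_mean) ** 2 for o in observed)
    return 1.0 - sse / total


def evaluate(xs, ys, seed=0):
    """Return (cv_q2, test_r2) for a 75/25 split with 5-fold CV on the training set."""
    train, test = split_train_test(len(xs), 0.75, seed)
    train_mean = sum(ys[i] for i in train) / len(train)

    # Cross-validation: each training chemical is predicted once,
    # by a model fitted without the fold that contains it.
    cv_observed, cv_predicted = [], []
    for fold in k_folds(train, 5):
        held_out = set(fold)
        fit_idx = [i for i in train if i not in held_out]
        model = fit_line([xs[i] for i in fit_idx], [ys[i] for i in fit_idx])
        cv_observed += [ys[i] for i in fold]
        cv_predicted += predict(model, [xs[i] for i in fold])
    cv_q2 = determination(cv_observed, cv_predicted, train_mean)

    # External validation: refit on the full training set, predict the test set.
    model = fit_line([xs[i] for i in train], [ys[i] for i in train])
    test_predicted = predict(model, [xs[i] for i in test])
    test_r2 = determination([ys[i] for i in test], test_predicted, train_mean)
    return cv_q2, test_r2


if __name__ == "__main__":
    # Synthetic data: one descriptor and a noisy linear response.
    rng = random.Random(1)
    xs = [float(i) for i in range(40)]
    ys = [2.0 * x + 1.0 + rng.gauss(0.0, 3.0) for x in xs]
    cv_q2, test_r2 = evaluate(xs, ys)
    print(f"CV Q2 = {cv_q2:.2f}, test R2 = {test_r2:.2f}")
```

The test-set chemicals play no part in fitting or cross-validation, so the test R2 acts as an external check on the Q2 obtained from the training data alone.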

URLs/Downloads:

OPERA_ACS_FINAL.PDF (PDF, 2350.203 KB)

OPERA_ACS_2018 AJW_ABSTRACT_CLEAN.PDF (PDF, 75.925 KB)

Record Details:

Record Type: DOCUMENT (PRESENTATION/SLIDE)
Product Published Date: 03/22/2018
Record Last Revised: 04/11/2018
OMB Category: Other
Record ID: 340233

Organization:

U.S. ENVIRONMENTAL PROTECTION AGENCY

OFFICE OF RESEARCH AND DEVELOPMENT

NATIONAL CENTER FOR COMPUTATIONAL TOXICOLOGY