Science Inventory

The EPA Online Database of Experimental and Predicted Data to Support Environmental Scientists (ACS Fall meeting)

Citation:

Williams, A., C. Grulke, K. Mansouri, J. Smith, J. Foster, M. Krzyzanowski, AND J. Edwards. The EPA Online Database of Experimental and Predicted Data to Support Environmental Scientists (ACS Fall meeting). To be Presented at ACS Fall Meeting, Philadelphia, PA, August 21 - 25, 2016. https://doi.org/10.23645/epacomptox.5181532

Impact/Purpose:

Poster presentation at the ACS Fall Meeting ENVR session on Computational Chemistry and Toxicology in Chemical Discovery and Assessment (QSAR).

Description:

As part of our efforts to develop a public platform that provides access to predictive models, we have attempted to disentangle the influence of data quality versus data quantity on the development and validation of QSAR models. Drawing on a thorough manual review of the data underlying the well-known EPI Suite software, we developed automated processes for validating the data using a KNIME workflow. These include approaches to validate different chemical structure representations (e.g., molfile and SMILES) and identifiers (chemical names and registry numbers), and methods to standardize the data into QSAR-consumable formats for modeling. Our efforts to quantify and segregate the data into quality categories have allowed us to thoroughly investigate the models developed from these data slices and to examine whether efforts to develop large, high-quality datasets have the expected payoff in prediction performance. Machine-learning approaches have been applied to create a series of models that have been used to generate predicted physicochemical and environmental parameters for over 700,000 chemicals. These data are available online via the EPA’s iCSS Chemistry Dashboard. This abstract does not reflect U.S. EPA policy.
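
To illustrate the kind of identifier validation mentioned above, the following is a minimal Python sketch of one common check: verifying the check digit of a CAS Registry Number. It is not the authors' KNIME workflow, and the function name is_valid_casrn is a hypothetical choice made for this example. The check-digit rule itself is the standard one: the digits before the check digit are weighted 1, 2, 3, ... from right to left, and the weighted sum modulo 10 must equal the check digit.

```python
import re

# A CAS Registry Number has the form NNNNNNN-NN-N:
# 2 to 7 digits, a hyphen, 2 digits, a hyphen, and 1 check digit.
CAS_PATTERN = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")


def is_valid_casrn(casrn: str) -> bool:
    """Return True if the string is a well-formed CAS RN with a correct check digit.

    Hypothetical helper for illustration; it checks format and checksum only,
    not whether the number is actually assigned to a substance.
    """
    match = CAS_PATTERN.match(casrn.strip())
    if match is None:
        return False
    body = match.group(1) + match.group(2)  # all digits except the check digit
    check_digit = int(match.group(3))
    # Weight the digits 1, 2, 3, ... starting from the rightmost digit of the body.
    weighted_sum = sum(
        int(digit) * weight
        for weight, digit in enumerate(reversed(body), start=1)
    )
    return weighted_sum % 10 == check_digit


if __name__ == "__main__":
    for candidate in ["7732-18-5", "50-00-0", "7732-18-4", "water"]:
        print(candidate, is_valid_casrn(candidate))
```

A check like this only catches malformed or mistyped registry numbers. Confirming that a registry number, a chemical name, and a structure all refer to the same substance requires cross-referencing against curated data, which a checksum cannot do.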

Record Details:

Record Type: DOCUMENT (PRESENTATION/POSTER)
Product Published Date: 08/22/2016
Record Last Revised: 08/24/2016
OMB Category: Other
Record ID: 325110