Science Inventory

Chemotype-Enrichment Workflow: A univariate analysis workflow for exploring chemical feature enrichments across EPA’s ToxCast chemical-assay landscape

Citation:

Lougee, R., A. Richard, and C. Grulke. Chemotype-Enrichment Workflow: A univariate analysis workflow for exploring chemical feature enrichments across EPA’s ToxCast chemical-assay landscape. American Chemical Society Spring Meeting, New Orleans, LA, March 18-22, 2018. https://doi.org/10.23645/epacomptox.7243166

Impact/Purpose:

Chemotypes are a set of structurally defined chemical features that can be used to describe and categorize chemicals. ToxPrint chemotypes are designed specifically to identify structures that are important to the environmental, regulatory, and commercial-use chemical space, and to represent chemical patterns and properties relevant to toxicity concerns. The public ToxPrint chemotype library and the associated publicly available chemotyping software provide an ideal resource for developing a generalized workflow to explore chemical-assay associations in the chemotype plane.

Description:

EPA’s ToxCast library, spanning more than 4000 diverse chemical structures (>8000 including Tox21 chemicals), is designed to cover the environmental toxicity and chemical exposure landscape of interest to EPA. Each ToxCast chemical has been screened in tens to hundreds of in vitro high-throughput screening (HTS) assays, and in vivo toxicity data are available for over 1000 of the chemicals. A principal aim of the ToxCast program is to use data across these domains to build models for predicting potential toxicity or exposure, and for prioritizing limited testing resources. However, largely because of their chemically and mechanistically diverse contents, ToxCast/Tox21 data sets pose challenges to traditional global structure-activity relationship (SAR) modeling approaches. Codifying local chemistry domains within the inventory, using publicly available fingerprinting methods, can amplify SAR signals within these domains.

An automated set of command-line tools, known as the Chemotype-Enrichment Workflow (CTEW), has been developed to identify enriched fingerprint features (known as chemotypes) across ToxCast chemicals and the corresponding HTS and in vivo assay results. The workflow generates fingerprints (e.g., ToxPrints, MACCS, PubChem) for EPA’s DSSTox database content (>1M structures), as well as for newly introduced structures. For each of the >800 ToxCast/Tox21 HTS assay datasets, the workflow first queries a fingerprint file of the test set structures, then uses a discretized assay endpoint (1,0) vector to identify enriched features, handles duplicate assay chemicals, and finally generates an enrichment table, applying statistical thresholds of odds ratio >3, Fisher’s exact p-value <0.05, and a minimum of 3 active chemicals.

The approach offers an intuitive, flexible complement to traditional SAR methods, with results that are easily interpreted and anchored to defined chemical features, and that can productively guide more targeted SAR investigations. This abstract does not represent U.S. EPA policy.
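
The enrichment criteria named in the description (odds ratio >3, Fisher’s exact p-value <0.05, at least 3 active chemicals) can be illustrated with a short Python sketch. This is not the CTEW code; it is a minimal, standard-library illustration under stated assumptions. The function names are hypothetical, the Fisher test is taken to be one-sided (testing for enrichment only), and "a minimum of 3 active chemicals" is read as at least 3 active chemicals that contain the feature. The abstract does not specify either choice.

```python
from math import comb


def contingency(feature_bits, activity):
    """Count the 2x2 table from two parallel 0/1 lists.

    Returns (a, b, c, d):
      a = feature present, active      b = feature present, inactive
      c = feature absent,  active      d = feature absent,  inactive
    """
    a = b = c = d = 0
    for has_feature, is_active in zip(feature_bits, activity):
        if has_feature and is_active:
            a += 1
        elif has_feature:
            b += 1
        elif is_active:
            c += 1
        else:
            d += 1
    return a, b, c, d


def odds_ratio(a, b, c, d):
    """Sample odds ratio (a*d)/(b*c); inf when only the denominator is zero."""
    if b * c == 0:
        return float("inf") if a * d > 0 else float("nan")
    return (a * d) / (b * c)


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact p-value: P(X >= a) under the hypergeometric null."""
    total = a + b + c + d
    actives = a + c
    with_feature = a + b
    upper = min(actives, with_feature)
    numerator = sum(
        comb(actives, k) * comb(total - actives, with_feature - k)
        for k in range(a, upper + 1)
    )
    return numerator / comb(total, with_feature)


def is_enriched(a, b, c, d, min_or=3.0, max_p=0.05, min_actives=3):
    """Apply the three thresholds quoted in the abstract.

    Assumption: "minimum of 3 active chemicals" is interpreted as a >= 3.
    """
    return (
        a >= min_actives
        and odds_ratio(a, b, c, d) > min_or
        and fisher_exact_greater(a, b, c, d) < max_p
    )


# Toy data (invented for illustration): 20 chemicals, 6 carry the feature.
feature_bits = [1] * 6 + [0] * 14
activity = [1] * 5 + [0] * 1 + [1] * 2 + [0] * 12

table = contingency(feature_bits, activity)   # (5, 1, 2, 12)
print(odds_ratio(*table))                     # 30.0
print(is_enriched(*table))                    # True
```

In this toy case, 5 of the 6 feature-bearing chemicals are active versus 2 of the 14 others, so the feature passes all three thresholds. A two-sided test, or a different reading of the minimum-actives rule, would change which features are flagged near the cutoffs.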

Record Details:

Record Type: DOCUMENT (PRESENTATION/SLIDE)
Product Published Date: 03/22/2018
Record Last Revised: 12/12/2018
OMB Category: Other
Record ID: 342910