Science Inventory

Representation and Enumeration of UVCB Substances to Enable QSAR Predictions for Substances with Structural Uncertainty (QSAR 2021)

Citation:

Grulke, Chris, A. Williams, B. Meyer, AND A. Richard. Representation and Enumeration of UVCB Substances to Enable QSAR Predictions for Substances with Structural Uncertainty (QSAR 2021). QSAR 2021 International Workshop on QSAR in Environmental and Health Sciences, Virtual, NC, June 07 - 10, 2021. https://doi.org/10.23645/epacomptox.15101832

Impact/Purpose:

Presentation to the QSAR 2021 International Workshop on QSAR in Environmental and Health Sciences June 2021. Many chemicals regulated by the US EPA fall in the class of substances referred to as Unknown, Variable composition, Complex reaction product or Biological origin substances (UVCBs). UVCBs cannot be represented with a single structure and, therefore, pose a significant challenge when attempting to apply QSAR models for in silico risk prioritization. Public chemical databases often include a single representative structure for UVCBs, leading to non-unique substance-structure mappings and potentially erroneous estimations of substance properties and activities based on representative structures. Work from research into new storage mechanisms for chemicals that currently cannot be effectively documented using structure drawing tools, e.g. mixtures or UVCBs, will facilitate the linkages between substances in the database and provide improved support for local chemistry database needs within and outside of EPA.

Description:

Many chemicals regulated by the US EPA fall in the class of substances referred to as Unknown, Variable composition, Complex reaction product or Biological origin substances (UVCBs). These chemicals cannot be represented with a single structure and, therefore, pose a significant challenge when attempting to apply QSAR models for in silico risk prioritization. Additionally, public chemical databases often include a single representative structure for UVCBs, leading to non-unique substance-structure mappings and potentially erroneous estimations of substance properties and activities based on representative structures. To enable computational estimation of UVCB properties and activities, particularly for inventories of high interest to EPA, EPA’s DSSTox project has begun to more thoroughly document the relationships between UVCBs and their potential components through use of manual relationship annotations or Markush/query structures. The use of Markush and query structures within a database presents new challenges, including difficulty with determining substance uniqueness (since Markush structures lack InChIs), storage format limitations, and inconsistent (or absent) handling of the representations by different software packages. On the other hand, Markush structures offer powerful capabilities for programmatic structure enumeration, enabling enhanced search capabilities and detection of related substances, as well as the prediction of the range and boundaries of the properties and activities associated with a UVCB substance. Such predictions, in turn, can help to identify cases where a single manually annotated structure representative, or a small number of them, would be insufficient to properly understand the potential risks associated with the UVCB. This abstract does not necessarily represent U.S. EPA policy.
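
The idea of enumerating a Markush structure and then bounding a predicted property across its components can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from DSSTox or the presentation: the phthalate diester template, the C8 to C10 chain range, the function names, and the heavy-atom count used as a toy stand-in for a real QSAR model.

```python
from itertools import combinations_with_replacement

# Illustrative Markush-style template (an assumption, not a DSSTox record):
# a phthalate diester core with two variable linear alkyl chains.
TEMPLATE = "{r1}OC(=O)c1ccccc1C(=O)O{r2}"
CHAIN_LENGTHS = range(8, 11)  # C8 to C10 linear alkyl chains


def enumerate_components(template, chain_lengths):
    """Expand the template into one SMILES string per unique component.

    The two R-group positions are symmetric in this template, so unordered
    pairs of chain lengths are used to avoid emitting the same molecule twice.
    """
    components = []
    for n1, n2 in combinations_with_replacement(chain_lengths, 2):
        components.append(template.format(r1="C" * n1, r2="C" * n2))
    return components


def heavy_atom_count(smiles):
    """Toy descriptor standing in for a QSAR model.

    Valid only for SMILES built from single-letter atoms (C, c, O here).
    """
    return sum(ch.isalpha() for ch in smiles)


def property_range(smiles_list, predict):
    """Return the (min, max) of a predicted property across all components."""
    values = [predict(s) for s in smiles_list]
    return min(values), max(values)


components = enumerate_components(TEMPLATE, CHAIN_LENGTHS)
low, high = property_range(components, heavy_atom_count)
print(len(components), low, high)
```

In this sketch a single representative structure would report one value, whereas enumeration exposes the spread across all six components. Deduplicating by unordered chain pairs works only because the template is symmetric; a general workflow would need canonical identifiers from a cheminformatics toolkit to decide uniqueness.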

Record Details:

Record Type: DOCUMENT (PRESENTATION/POSTER)
Product Published Date: 06/10/2021
Record Last Revised: 08/03/2021
OMB Category: Other
Record ID: 352466