Science Inventory

An automated framework for compiling and integrating chemical hazard data

Citation:

Vegosen, L. AND Todd M. Martin. An automated framework for compiling and integrating chemical hazard data. CLEAN TECHNOLOGIES AND ENVIRONMENTAL POLICY. Springer-Verlag, New York, NY, 22:441-458, (2020). https://doi.org/10.1007/s10098-019-01795-w

Impact/Purpose:

Efforts such as RapidTox and chemical prioritization under TSCA require toxicity data for a large number of endpoints. For most chemicals, there are serious data gaps for these endpoints. To fill these data gaps, an approach was developed to mine publicly available data from hazardous chemical lists, Globally Harmonized System (GHS) hazard codes (H-codes) or hazard categories from government health agencies, experimental quantitative toxicity values, and predicted values using quantitative structure-activity relationship (QSAR) models. The advantage of using sources such as GHS hazard codes is that these values have been reviewed by scientists in other environmental agencies around the world. QSAR model predictions were obtained using EPA’s Toxicity Estimation Software Tool (T.E.S.T.). The hazard database developed for this effort may contribute to several cheminformatics, public health, and environmental activities.

Description:

Comparative chemical hazard assessment, which compares hazards for several endpoints across several chemicals, can be used for a variety of purposes including alternatives assessment and the prioritization of chemicals for further assessment. A new framework was developed to compile and integrate chemical hazard data for several human health and ecotoxicity endpoints from public online sources including hazardous chemical lists, Globally Harmonized System hazard codes (H-codes) or hazard categories from government health agencies, experimental quantitative toxicity values, and predicted values using Quantitative Structure–Activity Relationship (QSAR) models. QSAR model predictions were obtained using EPA’s Toxicity Estimation Software Tool. Java programming was used to download hazard data, convert data from each source into a consistent score record format, and store the data in a database. Scoring criteria based on the EPA’s Design for the Environment Program Alternatives Assessment Criteria for Hazard Evaluation were used to determine ordinal hazard scores (i.e., low, medium, high, or very high) for each score record. Different methodologies were assessed for integrating data from multiple sources into one score for each hazard endpoint for each chemical. The chemical hazard assessment (CHA) Database developed in this study currently contains more than 990,000 score records for more than 85,000 chemicals. The CHA Database and the methods used in its development may contribute to several cheminformatics, public health, and environmental activities.
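
The pipeline described above (convert each source value into a score record, assign an ordinal score, then integrate records per endpoint) can be illustrated with a minimal Java sketch. This is not the authors' code: the class, field, and method names are hypothetical, the LD50 cut-offs are illustrative placeholders rather than the published scoring criteria, and taking the most severe score is only one of the integration methodologies that could be assessed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names and thresholds are assumptions, not the CHA Database schema.
class HazardScoreSketch {

    // Ordinal hazard scores named in the description, ordered from least to most severe.
    enum Score { LOW, MEDIUM, HIGH, VERY_HIGH }

    // One source's score for one chemical and one hazard endpoint.
    static final class ScoreRecord {
        final String chemicalId;
        final String endpoint;
        final String source;
        final Score score;

        ScoreRecord(String chemicalId, String endpoint, String source, Score score) {
            this.chemicalId = chemicalId;
            this.endpoint = endpoint;
            this.source = source;
            this.score = score;
        }
    }

    // Maps a quantitative oral LD50 (mg/kg) to an ordinal score.
    // The cut-offs below are illustrative assumptions for this sketch.
    static Score scoreOralLd50(double ld50MgPerKg) {
        if (ld50MgPerKg <= 50.0) return Score.VERY_HIGH;
        if (ld50MgPerKg <= 300.0) return Score.HIGH;
        if (ld50MgPerKg <= 2000.0) return Score.MEDIUM;
        return Score.LOW;
    }

    // One possible integration rule: keep the most severe score across sources.
    // Returns null when there are no records, i.e. a data gap.
    static Score integrateByMax(List<ScoreRecord> records) {
        Score worst = null;
        for (ScoreRecord r : records) {
            if (worst == null || r.score.compareTo(worst) > 0) {
                worst = r.score;
            }
        }
        return worst;
    }

    public static void main(String[] args) {
        // Made-up inputs for a placeholder chemical; not real toxicity data.
        List<ScoreRecord> records = new ArrayList<>();
        records.add(new ScoreRecord("EXAMPLE-CHEM", "acute oral toxicity",
                "hypothetical experimental value", scoreOralLd50(100.0)));
        records.add(new ScoreRecord("EXAMPLE-CHEM", "acute oral toxicity",
                "hypothetical QSAR prediction", scoreOralLd50(1500.0)));
        System.out.println("Integrated score: " + integrateByMax(records));
    }
}
```

Taking the maximum is a conservative choice that never lets a lower-hazard source mask a higher-hazard one; an alternative would be to rank sources by trust and use the score from the most authoritative source available.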

Record Details:

Record Type: DOCUMENT (JOURNAL/PEER REVIEWED JOURNAL)
Product Published Date: 01/21/2020
Record Last Revised: 04/22/2021
OMB Category: Other
Record ID: 350767