Science Inventory

An Integrative data mining approach to identifying Adverse Outcome Pathway (AOP) Signatures

Citation:

Oki, N. AND S. Edwards. An Integrative data mining approach to identifying Adverse Outcome Pathway (AOP) Signatures. TOXICOLOGY. Elsevier Science Ltd, New York, NY, 350:49-61, (2016).

Impact/Purpose:

Adverse Outcome Pathways can expand and enhance the use of ToxCast and other in vitro toxicity information by providing a mechanistic link to adverse outcomes of regulatory concern, but the expert-driven development of AOPs is labor-intensive and time-consuming. This work is intended to complement that process by creating a broad array of computationally-predicted AOPs (cpAOPs) through data mining of publicly available data. This increases the coverage of AOPs and provides at least minimal information in many cases where nothing would otherwise be available. It also provides a starting point for expert-driven development of AOPs, thereby potentially accelerating that process.

Description:

The Adverse Outcome Pathway (AOP) framework is a tool for making biological connections and summarizing key information across different levels of biological organization to connect biological perturbations at the molecular level to adverse outcomes for an individual or population. Computational approaches to explore and determine these connections can accelerate the assembly of AOPs. By leveraging the wealth of publicly available data covering chemical effects on biological systems, computationally-predicted AOPs (cpAOPs) were assembled via data mining of high-throughput screening (HTS) in vitro data, in vivo data, and other disease phenotype information. Frequent Itemset Mining (FIM) was used to find associations between the gene targets of ToxCast HTS assays and disease data from the Comparative Toxicogenomics Database (CTD) by using the chemicals as the common aggregators between datasets. The method was also used to map gene expression data to disease data from CTD. A cpAOP network was defined by considering genes and diseases as nodes and FIM associations as edges. This network contained 18,283 gene-to-disease associations for the ToxCast data and 110,253 for CTD gene expression. Two case studies show the value of the cpAOP network by extracting subnetworks focused either on fatty liver disease or the Aryl Hydrocarbon Receptor (AHR). The subnetwork surrounding fatty liver disease included many genes known to play a role in this disease. When querying the cpAOP network with the AHR gene, an interesting subnetwork including glaucoma was identified. While substantial literature exists to support the potential for AHR ligands to elicit glaucoma, it was not explicitly captured in the public annotation information in CTD. The subnetwork from this analysis suggests a cpAOP that includes changes in CYP1B1 expression. These case studies highlight the value of integrating multiple data sources when defining cpAOPs for HTS data.
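
Illustrative sketch (not taken from the paper): the idea of using chemicals as the common aggregator between gene and disease data can be approximated by treating each chemical as a "transaction" and counting how many chemicals support each gene-disease pair. The example below is a simplified stand-in for Frequent Itemset Mining, restricted to gene-disease pairs, with a plain support-count threshold. The chemical names, the gene placeholder GENE_X, the toy associations, the function names, and the min_support value are all hypothetical and chosen only for illustration; they do not reflect the paper's data, parameters, or findings.

```python
from collections import defaultdict
from itertools import product

# Hypothetical toy data, NOT real ToxCast or CTD content.
# Each chemical maps to the gene targets it hits and the diseases it is linked to.
chem_genes = {
    "chem_A": {"AHR", "CYP1B1"},
    "chem_B": {"AHR", "CYP1B1"},
    "chem_C": {"GENE_X"},
    "chem_D": {"AHR"},
}
chem_diseases = {
    "chem_A": {"glaucoma"},
    "chem_B": {"glaucoma", "fatty liver"},
    "chem_C": {"fatty liver"},
    "chem_D": {"fatty liver"},
}


def gene_disease_pairs(chem_genes, chem_diseases, min_support=2):
    """Return {(gene, disease): support} for pairs seen in >= min_support chemicals.

    Support is the number of distinct chemicals associated with both the gene
    and the disease. The threshold is an assumed, illustrative value.
    """
    supporters = defaultdict(set)
    # Only chemicals present in both datasets can link a gene to a disease.
    for chem in chem_genes.keys() & chem_diseases.keys():
        for gene, disease in product(chem_genes[chem], chem_diseases[chem]):
            supporters[(gene, disease)].add(chem)
    return {
        pair: len(chems)
        for pair, chems in supporters.items()
        if len(chems) >= min_support
    }


def build_network(edges):
    """Build an undirected adjacency map with genes and diseases as nodes."""
    adjacency = defaultdict(dict)
    for (gene, disease), support in edges.items():
        adjacency[gene][disease] = support
        adjacency[disease][gene] = support
    return adjacency


edges = gene_disease_pairs(chem_genes, chem_diseases, min_support=2)
network = build_network(edges)

# Querying the network with a gene returns its neighboring disease nodes,
# analogous in spirit to extracting a subnetwork around AHR.
for disease, support in sorted(network["AHR"].items()):
    print(f"AHR -- {disease} (supported by {support} chemicals)")
```

A full FIM implementation would also consider larger itemsets and would typically rank associations with additional measures beyond raw support; this sketch keeps only the pair-counting step to show how chemicals bridge the two datasets.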

Record Details:

Record Type: DOCUMENT (JOURNAL/PEER REVIEWED JOURNAL)
Product Published Date: 03/28/2016
Record Last Revised: 11/22/2017
OMB Category: Other
Record ID: 336737