Science Inventory

Literature-mining and Transcriptomic Stress Response Annotation of a Large Chemical Database

Citation:

Chambers, B., L. Taylor, N. Baker, R. Judson, AND I. Shah. Literature-mining and Transcriptomic Stress Response Annotation of a Large Chemical Database. Society of Toxicology 61st Annual Meeting and ToxExpo 2022, San Diego, CA, March 27 - 31, 2022. https://doi.org/10.23645/epacomptox.20387103

Impact/Purpose:

Poster presented to the Society of Toxicology 61st Annual Meeting and ToxExpo, March 2022. This presentation will inform the community of advances in automated reference chemical annotation using combined transcriptomic and literature search mining approaches. Critical outputs are a literature search tool and a case study illustrating its usefulness for mining stress-response-active chemicals and for examining the cell line dependence of stress response pathway activity.

Description:

Stress response pathways (SRPs) have been implicated in human health conditions ranging from drug-induced liver injury (DILI) to neurodevelopmental disorders, suggesting their utility as biomarkers for chemical screening. Developing SRP biomarkers requires a diverse body of annotated chemicals to resolve stress pathway crosstalk, which can reduce signature specificity. To support this need, we describe a novel method to identify SRP reference transcriptomic profiles by integrating literature mining with transcriptomic analysis. We queried 4761 chemicals from the Library of Integrated Network-Based Cellular Signatures (LINCS) transcriptomic database against SRP bioactivity search phrases on the National Center for Biotechnology Information’s PubMed. We quantified the strength of each chemical’s association with SRP activity in abstracts using pairwise mutual information (PMI) scores derived from PubMed search results. This analysis identified 1806 putative SRP-activating chemicals. We next manually evaluated a validation subset of 93 high-confidence chemicals and found 70% agreement between literature and transcriptomic bioactivity scores derived from gene set enrichment analysis. Protein misfolding stress matched with nearly 100% accuracy, while oxidative stress was more varied, with only 30% accuracy. We extended the analysis, identifying 340 chemicals by literature PMI scoring that clustered transcriptomically, highlighting the potential of the approach for automating reference chemical assignment. Further, we examined the dependence of transcriptomic bioactivity on cell line and found that protein stress activity was homogeneous across cell lines while DNA damage activity was highly heterogeneous, indicating that SRP markers should not be applied without regard to cell type. The result of this study is a database of SRP-active chemicals from which SRP signatures might be tuned to better assay stress pathway bioactivity.
This abstract does not necessarily represent U.S. EPA policy.    
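
The literature-scoring step described in the abstract can be illustrated with a minimal sketch. It assumes the PMI score takes the standard pointwise-mutual-information form, log2 of the observed co-occurrence probability over the product of the individual probabilities; the abstract does not state the exact formula, so this form is an assumption. The function name, its parameters, and the counts are hypothetical and are not taken from the study.

```python
import math

def pmi(n_both, n_chem, n_phrase, n_total):
    """Pointwise mutual information (log base 2) from document hit counts.

    Hypothetical inputs: number of abstracts matching both the chemical and
    the SRP phrase, the chemical alone, the phrase alone, and the size of
    the searched corpus.
    """
    if min(n_both, n_chem, n_phrase, n_total) <= 0:
        return None  # the score is undefined when any count is zero
    # log2( p(chem, phrase) / (p(chem) * p(phrase)) ), written in counts
    return math.log2((n_both * n_total) / (n_chem * n_phrase))

# Invented counts, for illustration only (not data from the study).
score = pmi(n_both=50, n_chem=100, n_phrase=1000, n_total=1_000_000)
print(round(score, 2))
```

Under this form, a positive score means the chemical and the phrase co-occur in abstracts more often than expected if they were independent, and a score of zero means they co-occur exactly as often as independence predicts.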

Record Details:

Record Type: DOCUMENT (PRESENTATION/POSTER)
Product Published Date: 03/31/2022
Record Last Revised: 07/27/2022
OMB Category: Other
Record ID: 355348