Science Inventory

Navigating Connectivity Mapping Workflows for Predicting Molecular Targets with Gecco

Citation:

Shah, I., J. Bundy, B. Chambers, L. Everett, D. Haggard, J. Harrill, AND R. Judson. Navigating Connectivity Mapping Workflows for Predicting Molecular Targets with Gecco. Intelligent Systems for Molecular Biology, Madison, WI, July 10 - 14, 2022. https://doi.org/10.23645/epacomptox.25868806

Impact/Purpose:

Presentation to the 30th Conference on Intelligent Systems for Molecular Biology (ISMB), July 2022. Only a small fraction of the 32,898 chemicals in commerce can be evaluated for health effects because of the cost and duration of animal testing. Identifying the putative molecular targets of chemicals through tiered testing with in vitro assays could help determine their hazards and risks more effectively. The US EPA uses high-throughput transcriptomics to efficiently measure the impact of thousands of chemicals on global gene expression in vitro. Computational approaches are required to infer the putative activation of transcription factors and upstream regulators from transcriptomic data and thereby prioritize chemicals for further testing.

Description:

Connectivity mapping is a powerful approach for relating chemicals, targets, and diseases using transcriptomic data (Lamb 2007). It assumes that transcriptomic profiles can fingerprint biological states (e.g., target activation or pathway perturbation) using the universal language of genes, and that similarity between profiles implies biologically meaningful linkages. Connectivity mapping is widely used in drug discovery (Qu and Rajpal 2012), drug repurposing (Iorio et al. 2013), and chemical safety (Smalley, Gant, and Zhang 2010; Lee et al. 2021). Though gene set enrichment analysis (GSEA) (Subramanian et al. 2005) is widely used for connectivity mapping, many refinements and alternative measures of gene set similarity have also been proposed (Musa et al. 2017; Cheng et al. 2014). Moreover, there is a confusing array of connectivity mapping workflows for matching diverse gene signature collections (Liberzon et al. 2011) with transcriptomic reference databases (Lamb 2007; Subramanian et al. 2017). Navigating these workflows is a prerequisite for assessing the targets of thousands of environmental chemicals. We are developing the generalized connectivity toolkit (Gecco) to harmonize disparate connectivity mapping workflows that match a transcriptomic profile (x) (e.g., produced by a perturbagen) with a gene signature (S) associated with a biological target using a similarity measure (SM). As an illustrative example, we used a Gecco workflow to predict estrogen receptor (ESR1/2) activity of reference chemicals from published transcriptomic data in MCF7 cells. Additional details are in the extended abstract. This abstract does not necessarily reflect US EPA policy.
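
The general pattern described above, scoring a profile (x) against a signature (S) with a similarity measure (SM), can be sketched in a few lines of Python. This is only a minimal illustration and not Gecco's actual interface: the function names (mean_difference, rank_signatures), the data layout, the gene and target labels, and the numeric values are all hypothetical, and the simple "mean of up genes minus mean of down genes" score stands in for GSEA or any of the other similarity measures cited above.

```python
from statistics import mean
from typing import Callable, Dict, List, Set, Tuple

# Assumed (hypothetical) data layout, not taken from Gecco:
#   profile x   : gene identifier -> differential expression value
#   signature S : {"up": set of genes, "down": set of genes}
Profile = Dict[str, float]
Signature = Dict[str, Set[str]]


def mean_difference(x: Profile, s: Signature) -> float:
    """Toy similarity measure: mean of x over the signature's 'up' genes
    minus mean of x over its 'down' genes. Genes absent from x are ignored."""
    up = [x[g] for g in s.get("up", set()) if g in x]
    down = [x[g] for g in s.get("down", set()) if g in x]
    up_mean = mean(up) if up else 0.0
    down_mean = mean(down) if down else 0.0
    return up_mean - down_mean


def rank_signatures(
    x: Profile,
    signatures: Dict[str, Signature],
    sm: Callable[[Profile, Signature], float] = mean_difference,
) -> List[Tuple[str, float]]:
    """Score x against every signature with the chosen similarity measure
    and return (signature name, score) pairs, highest score first."""
    scores = [(name, sm(x, s)) for name, s in signatures.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Made-up profile and signatures, for illustration only.
x = {"G1": 2.0, "G2": 1.0, "G3": -1.0, "G4": -2.0, "G5": 0.0}
signatures = {
    "target_A": {"up": {"G1", "G2"}, "down": {"G3", "G4"}},
    "target_B": {"up": {"G3", "G4"}, "down": {"G1"}},
}

ranked = rank_signatures(x, signatures)
for name, score in ranked:
    print(f"{name}: {score:+.2f}")
```

Passing the similarity measure as a parameter mirrors the idea of harmonizing workflows: the same ranking step could be reused with a different scoring function without changing how profiles and signatures are represented.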

Record Details:

Record Type: DOCUMENT (PRESENTATION/EXTENDED ABSTRACT)
Product Published Date: 07/14/2022
Record Last Revised: 05/21/2024
OMB Category: Other
Record ID: 361510