Science Inventory

MIEML: Predicting Molecular Initiating Events by Integrating Machine Learning with Concentration Response Analysis of High Throughput Transcriptomic Chemical Screening Data

Citation:

Bundy, J., J. Rogers, K. Friedman, I. Shah, R. Judson, J. Harrill, AND L. Everett. MIEML: Predicting Molecular Initiating Events by Integrating Machine Learning with Concentration Response Analysis of High Throughput Transcriptomic Chemical Screening Data. Society of Toxicology 62nd Annual Meeting and ToxExpo 2023, Nashville, TN, March 19 - 23, 2023. https://doi.org/10.23645/epacomptox.22732112

Impact/Purpose:

Presentation to the Society of Toxicology 62nd Annual Meeting and ToxExpo March 2023. Extracting mechanistic information from high dimensional biological datasets is a persistent bioinformatic obstacle to the use of new approach methodologies (NAMs) in chemical safety screening. The framework presented in this abstract helps identify molecular initiating events associated with chemical exposure from high throughput transcriptomic datasets. This framework supports a tiered testing strategy for chemical safety screening by helping to identify which targeted tier II assays might help confirm the likely mechanisms of action of chemicals identified as bioactive in tier I screens.

Description:

The cost of RNA sequencing has decreased considerably since its inception. Simultaneously, the throughput of this technology has improved, and it is now possible to efficiently profile gene expression changes across thousands of samples in a single study. The falling costs and improving scalability of these technologies have contributed to their proposed use in regulatory contexts for chemical risk assessment. However, the analysis and interpretation of large, high dimensional transcriptomic datasets present a formidable obstacle to using these data in a regulatory decision-making context. As transcriptomic chemical screening data accumulate in the public domain, there is a growing need to develop new bioinformatics approaches for extracting mechanistic insight from these datasets. Here, we present mieml (Molecular Initiating Event prediction with Machine Learning), a bioinformatic framework for using transcriptomics data to predict the underlying molecular initiating event (MIE) associated with a chemical exposure. We demonstrate the utility of this framework in predicting MIE activation by training models using gene expression profiles from a large bioactivity screen conducted in the breast cancer-derived MCF7 cell line. We trained binary classifiers to predict activation of 20 distinct MIEs using transcriptomic profiles from reference chemicals known to be associated with each MIE. Of these, four sets of MIE classifiers were validated using permutation testing followed by an empirical significance analysis. Validated classifiers were then used to generate MIE activation predictions for 1,784 test chemicals at each concentration. These predictions were then used as inputs for concentration-response modeling using the tcplfit2 R package. Chemicals that showed concentration-responsive MIE prediction scores were identified as candidate activators of that MIE.
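The permutation-testing step described above can be illustrated with a minimal sketch. The abstract does not specify the classifier type, the performance metric, or the number of permutations, so everything below is an assumption made for illustration only: a nearest-centroid classifier, leave-one-out accuracy, 200 label permutations, and synthetic "expression profiles". It is not the mieml implementation. The empirical p-value is the fraction of label-shuffled runs that score at least as well as the real labels, with one added to the numerator and denominator so the estimate is never exactly zero.

```python
import random


def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def loo_accuracy(X, y):
    """Leave-one-out accuracy of a nearest-centroid binary classifier."""
    correct = 0
    for i in range(len(X)):
        pos = [x for j, x in enumerate(X) if j != i and y[j] == 1]
        neg = [x for j, x in enumerate(X) if j != i and y[j] == 0]
        if not pos or not neg:
            continue  # cannot classify without both classes; counts as a miss
        closer_to_pos = sq_dist(X[i], centroid(pos)) < sq_dist(X[i], centroid(neg))
        pred = 1 if closer_to_pos else 0
        if pred == y[i]:
            correct += 1
    return correct / len(X)


def permutation_p_value(X, y, n_perm=200, seed=0):
    """Return (observed score, empirical p-value) from label permutations."""
    rng = random.Random(seed)
    observed = loo_accuracy(X, y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the profile-to-label association
        if loo_accuracy(X, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Synthetic stand-in data (hypothetical): 10 "reference chemical" profiles
# (label 1) with two shifted features, and 10 background profiles (label 0).
data_rng = random.Random(42)
X = []
y = []
for label in (1, 0):
    for _ in range(10):
        profile = [data_rng.gauss(0.0, 1.0) for _ in range(5)]
        if label == 1:
            profile[0] += 4.0
            profile[1] += 4.0
        X.append(profile)
        y.append(label)

observed, p_value = permutation_p_value(X, y)
print(f"observed accuracy = {observed:.2f}, empirical p = {p_value:.4f}")
```

With only 200 permutations the smallest attainable p-value is 1/201, so a real analysis would likely use many more permutations and a metric suited to imbalanced classes.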
For further validation, we compared our MIE activation predictions to the ToxCast targeted high-throughput screening assay dataset, revealing general agreement between ToxCast bioactivities and mieml predictions. Specifically, mieml predictions showed concordance with ToxCast data for estrogen receptor agonism, aryl hydrocarbon receptor agonism, and glucocorticoid receptor agonism. A subset of test chemicals was predicted to be bioactive by mieml but was not captured by the currently available ToxCast data or existing literature evidence, suggesting that mieml has the potential to identify uncharacterized MIE activators. However, further analysis is required to determine whether these predictions represent false positives or novel putative MIE activators. These results show that mieml is a reproducible framework for predicting MIE activation from large transcriptomic chemical screens. mieml predictions generally agree with targeted high-throughput screening data and have the potential to identify chemicals with novel MIE activity. Future work will focus on improving methods for validating MIE predictions, as well as extending this framework to other cell lines to identify cell types that are better suited to predicting MIEs that are not well represented within the MCF7 cell line. This abstract does not necessarily reflect US EPA policy. Company or product names do not constitute endorsement by US EPA.

Record Details:

Record Type: DOCUMENT (PRESENTATION/POSTER)
Product Published Date: 03/23/2023
Record Last Revised: 05/15/2023
OMB Category: Other
Record ID: 357840