Science Inventory

Exploring ToxCast’s invitroDB: Increasing public accessibility to targeted bioactivity screening data

Citation:

Thunes, C., J. Brown, A. Ko, A. Rashid, N. Sipes, K. Friedman, AND M. Feshuk. Exploring ToxCast’s invitroDB: Increasing public accessibility to targeted bioactivity screening data. SOT, Salt Lake City, UT, March 10 - 14, 2024. https://doi.org/10.23645/epacomptox.25395664

Impact/Purpose:

The ToxCast data pipeline (tcpl) is an open-source R package that stores, manages, curve-fits, and visualizes ToxCast data and populates the linked MySQL database, invitrodb. Recent software and database updates have expanded the available data delivery options. As a result, a more diverse group of users can work with ToxCast data within an integrated data landscape for a wide range of toxicology applications.

Description:

Background and Purpose: Building confidence in new approach methodologies (NAMs) for prioritization and hazard characterization requires bioactivity data to be accessible and easily interpretable. The US Environmental Protection Agency Toxicity Forecaster (ToxCast) program makes in vitro medium- and high-throughput screening assay data publicly available for thousands of chemicals. The ToxCast pipeline (tcpl) is an open-source R package that stores, manages, curve-fits, and visualizes data while populating the linked MySQL database, invitrodb. Versioned releases of tcpl are available from the Comprehensive R Archive Network (CRAN): https://CRAN.R-project.org/package=tcpl. For users seeking more detail on tcpl functionality, vignettes introduce the package and describe database structure, assay annotation, processing, and data retrieval. The latest invitrodb release (v4.1, September 2023) includes data from 26 assay sources, including the Tox21 program, spanning 623 assay platforms and 1,496 normalized assay readouts. These cover a diverse biological space, with over 500 mapped gene targets, for 9,559 unique chemicals. Ongoing work is focused on augmenting and diversifying how ToxCast data can be used to address toxicology research questions for a heterogeneous user group.

Methods: The integrated nature of the ToxCast project allows for collaborations and interoperability with other software applications. To accommodate recent software and database releases while supporting a growing ToxCast userbase, accessibility has been expanded within existing tools via updates to the CompTox Chemicals Dashboard (CCD, https://comptox.epa.gov/dashboard), application programming interfaces (APIs), and database downloads.

Results: For a novice user, the bioactivity module of the CCD presents potency and relative efficacy metrics across ToxCast endpoints for chemicals of interest. There, users can sort, filter, and export results. APIs, currently undergoing additional redevelopment, also provide data for various use cases, including research and applications with user interfaces. Users will be able to avoid large data downloads by accessing invitrodb programmatically via an API. This provides the best read-only solution for users who require more flexibility than the CCD can provide. For users with more customized or programmatic ToxCast data needs, or with a need to use tcpl for their own data processing, setting up a personal instance of invitrodb and interacting with the data directly via tcpl is often the preferred option. In addition to providing access to all processed levels of data and metadata, tcpl includes tcplPlot, a function for interactive yet consistent visualization of concentration-response curves. Critical plotting updates accommodate the application of ten curve-fitting models and produce publication-quality graphs. Plot options include toggling consistent y-axis scaling, comparing across chemicals or endpoints to understand relative response, single-concentration plotting, and displaying cautionary flags on curve-fit behavior to aid interpretation. In addition, the function tcplDefine provides access to invitrodb’s new data dictionary: any database term (table or field name) can be supplied and a definition is returned, giving users of all expertise levels the opportunity to improve their understanding of ToxCast.

Conclusions: ToxCast provides a standard for consistent and reproducible data pipelining for diverse, targeted bioactivity assay data, with readily available documentation and a unified open-source software approach. Recent software and database updates have expanded the available data delivery options. As a result, a more diverse group of users can work with ToxCast data within an integrated data landscape for a wide range of toxicology applications. This abstract does not necessarily reflect U.S. EPA policy.
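
As a rough illustration of the programmatic, read-only access route described in the Results, the Python sketch below builds (without sending) a GET request for one chemical's bioactivity records. The abstract does not give the API's base URL, route, or authentication scheme, so BASE_URL, ROUTE, API_KEY_HEADER, the identifier "CHEM-001", and the function name build_bioactivity_request are all hypothetical placeholders, not the actual service interface. Because the APIs are described as undergoing redevelopment, the real details should be taken from the current API documentation.

```python
from urllib.parse import quote, urljoin
from urllib.request import Request

# ASSUMPTION: placeholder values only. Substitute the base URL, route, and
# authentication header documented for the real API before use.
BASE_URL = "https://api.example.invalid/"
ROUTE = "bioactivity/data/by-chemical/"
API_KEY_HEADER = "x-api-key"


def build_bioactivity_request(chemical_id: str, api_key: str) -> Request:
    """Build, but do not send, a read-only GET request for one chemical."""
    if not chemical_id:
        raise ValueError("chemical_id must be non-empty")
    # Percent-encode the identifier so it is safe to embed in the URL path.
    url = urljoin(BASE_URL, ROUTE + quote(chemical_id, safe=""))
    return Request(
        url,
        headers={API_KEY_HEADER: api_key, "Accept": "application/json"},
        method="GET",
    )


if __name__ == "__main__":
    request = build_bioactivity_request("CHEM-001", "my-key")
    print(request.get_method(), request.full_url)
```

Requesting records for one chemical at a time is what lets a user avoid downloading the full database. Sending the request (for example with urllib.request.urlopen) and parsing the JSON response are left out here because the endpoint above is a placeholder.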

Record Details:

Record Type: DOCUMENT (PRESENTATION/POSTER)
Product Published Date: 03/14/2024
Record Last Revised: 03/12/2024
OMB Category: Other
Record ID: 360705