Science Inventory

The ECOTOX Knowledgebase Pipeline: From literature search to data extraction AND SeqAPASS tool briefing

Citation:

Lalone, C. The ECOTOX Knowledgebase Pipeline: From literature search to data extraction AND SeqAPASS tool briefing. Interagency Coordinating Committee for Validation of Alternative Methods (ICCVAM) Eco-Workgroup, Duluth, MN, March 06, 2019.

Impact/Purpose:

The first part of this webinar will describe the ECOTOX Knowledgebase, a publicly accessible web-based tool that houses curated ecotoxicology data. Both the user interface and the curation pipeline of the tool have been significantly updated. This presentation will describe the advances made in collecting chemical toxicity data from available sources and in exploring the resulting data. It will also discuss ongoing work to allow the ECOTOX Knowledgebase to interact with other EPA tools, including SeqAPASS. The publicly accessible Sequence Alignment to Predict Across Species Susceptibility (SeqAPASS) tool has increasingly been recognized internationally as useful for understanding species similarities and differences in order to predict chemical susceptibility across taxa. The Organisation for Economic Co-operation and Development (OECD), an intergovernmental organization, has requested a briefing on the developmental status of the SeqAPASS tool and its applications. Updated guidance documents for identifying chemicals that adversely affect the endocrine system are being drafted, and these include the use of SeqAPASS for understanding cross-species chemical susceptibility. This presentation will serve to inform the international community of the status of this work.

Description:

The US Environmental Protection Agency’s Ecotoxicology (ECOTOX) Knowledgebase contains more than 30 years of reported single-chemical toxicity effects data on aquatic and terrestrial organisms. Approximately 900,000 test results covering more than 11,000 chemicals and 12,000 species are available in ECOTOX. While the database is currently used by many sectors for a variety of purposes, a future goal is to allow for computational modeling of the data to identify novel adverse outcome pathways and networks, and to assist in predicting chemical hazard and species sensitivity. One obstacle is that ECOTOX captures study designs and test results using author-reported descriptions, resulting in more than 4,000 codes. Relationships among these codes are often not apparent in the current design (e.g., aryl hydrocarbon hydroxylase and cytochrome P450 1A), and some codes are specific to the single study from which they derive (e.g., 3rd generation male). To enhance the query capability of the data within and external to the ECOTOX Knowledgebase, and to prepare for future computational functionality, the ECOTOX codes were mapped to existing biological ontology classes. A Java-based lookup tool was developed using the REST API of the ontology browser BioPortal (https://bioportal.bioontology.org/) to semi-automate the code mapping. This tool was designed to allow for batch processing and to make use of BioPortal’s annotator and recommender functions, so that all ontological class identifiers relevant to a particular ECOTOX term would be returned and specific ontologies recommended. Using this approach, the majority of the ECOTOX codes were mapped to ontological class identifiers; some terms required multiple identifiers to describe them properly. A set of unmapped terms unique to the ECOTOX database was also identified. The results were then manually curated to ensure proper context for the mapped classes.
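
The batch lookup described above can be illustrated with a minimal Java sketch. It is not the tool presented in the webinar, whose source is not part of this record. It only builds request URLs for the BioPortal annotator and recommender endpoints, assuming the public REST base URL https://data.bioontology.org and the `text`, `input`, and `apikey` query parameters. The class name, method names, and the placeholder API key are hypothetical.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: builds BioPortal request URLs for a batch of ECOTOX terms.
// It performs no network calls; sending the requests and parsing the JSON
// responses is left out.
class BioPortalLookupSketch {
    // Assumed base URL of the BioPortal REST API.
    static final String BASE = "https://data.bioontology.org";

    // URL-encode a query parameter value as UTF-8 (spaces become '+').
    static String enc(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    // Annotator request: asks which ontology classes match the given term.
    static String annotatorUrl(String term, String apiKey) {
        return BASE + "/annotator?text=" + enc(term) + "&apikey=" + enc(apiKey);
    }

    // Recommender request: asks which ontologies best cover the given term.
    static String recommenderUrl(String term, String apiKey) {
        return BASE + "/recommender?input=" + enc(term) + "&apikey=" + enc(apiKey);
    }

    // Batch processing: one annotator URL per term, in input order.
    static List<String> annotatorBatch(List<String> terms, String apiKey) {
        List<String> urls = new ArrayList<>();
        for (String term : terms) {
            urls.add(annotatorUrl(term, apiKey));
        }
        return urls;
    }

    public static void main(String[] args) {
        // "YOUR_API_KEY" is a placeholder, not a real key.
        List<String> terms = List.of("aryl hydrocarbon hydroxylase", "cytochrome P450 1A");
        for (String url : annotatorBatch(terms, "YOUR_API_KEY")) {
            System.out.println(url);
        }
    }
}
```

Each returned class identifier would still need the manual curation step mentioned above, since a text match alone does not guarantee that the ontology class fits the context of the ECOTOX code.
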
Additionally, advances have been made to make the ECOTOX Knowledgebase interoperable with other EPA tools, including SeqAPASS. Sequence Alignment to Predict Across Species Susceptibility (SeqAPASS) is a web-based tool that allows the user to begin to understand how broadly high-throughput screening results or adverse outcome pathway constructs may plausibly be extrapolated across species, while describing the relative intrinsic susceptibility of different taxa to chemicals with known modes of action (e.g., pharmaceuticals and pesticides). The tool rapidly and strategically assesses available molecular target information to describe protein sequence similarity at the primary amino acid sequence, conserved domain, and individual amino acid residue levels. This in silico approach to species extrapolation was designed to automate and streamline the relatively complex and time-consuming process of comparing protein sequences in a consistent, logical, and criteria-driven manner, with the aim of predicting cross-species susceptibility to a chemical perturbation. Over the last year, advancements have been made to the tool to automate the susceptibility prediction for evaluations of individual amino acid residues important for direct interaction with chemicals. Additionally, interactive data visualization capabilities have been incorporated. These new features, along with others in development, will enhance interpretation of the results.
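
To make the idea of comparing sequences at the whole-sequence and individual-residue levels concrete, the following Java sketch computes percent identity between two pre-aligned sequences and checks whether selected key residues match. It is a simplified illustration only: the criteria SeqAPASS actually applies are not described in this record, exact character matching is an assumption made here, and the class name, method names, and toy sequences are hypothetical.

```java
// Hypothetical sketch: toy comparison of two pre-aligned protein sequences.
// Exact matching is an assumption; it is not the SeqAPASS scoring method.
class ResidueComparisonSketch {

    // Percent identity over aligned columns, skipping columns where both are gaps ('-').
    static double percentIdentity(String a, String b) {
        if (a.length() != b.length()) {
            throw new IllegalArgumentException("Sequences must be aligned to equal length");
        }
        int matches = 0;
        int compared = 0;
        for (int i = 0; i < a.length(); i++) {
            char x = a.charAt(i);
            char y = b.charAt(i);
            if (x == '-' && y == '-') {
                continue; // column carries no information
            }
            compared++;
            if (x == y) {
                matches++;
            }
        }
        return compared == 0 ? 0.0 : 100.0 * matches / compared;
    }

    // True if the query has the same residue as the template at every key position
    // (0-based indices into the alignment).
    static boolean keyResiduesConserved(String template, String query, int[] positions) {
        for (int p : positions) {
            if (template.charAt(p) != query.charAt(p)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String template = "ACDE"; // toy sequence, not real data
        String query = "ACDF";    // toy sequence, not real data
        System.out.println(percentIdentity(template, query));
        System.out.println(keyResiduesConserved(template, query, new int[] {0, 2}));
    }
}
```

In this toy case the two sequences differ only at the last position, so a check restricted to positions 0 and 2 reports the key residues as conserved even though overall identity is below 100 percent.
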

Record Details:

Record Type: DOCUMENT (PRESENTATION/SLIDE)
Product Published Date: 03/06/2019
Record Last Revised: 03/06/2019
OMB Category: Other
Record ID: 344346