Office of Research and Development Publications

HTTr Pipeline Code v0.7.5

Citation:

Haggard, D., L. Girardin, J. Witten, B. Vallanat, J. Bundy, I. Shah, AND L. Everett. HTTr Pipeline Code v0.7.5. U.S. Environmental Protection Agency, Washington, DC, 2024.

Impact/Purpose:

Processing high-throughput transcriptomics (HTTr) data is computationally intensive and relies on multiple existing open-source tools and standards. This pipeline project serves to ensure the reproducibility of these efforts and to increase the efficiency of processing these large data sets in the available compute environments. The primary goal is to develop standardized and stable (i.e., version-controlled and reproducible) data processing pipelines for HTTr. Partners and other regulatory agencies are also adopting these profiling technologies, and several different analysis workflows have been proposed. Public release of these data pipelines is therefore an opportunity to promote harmonization of methods across organizations. The HTTr pipeline (httrpl) has been under development for some time. It is intended to be a flexible, version-controlled, and reproducible data analysis pipeline for processing HTTr data generated using targeted RNA-seq, starting from raw FASTQ files. This sub-product will result in the public release of httrpl on the EPA GitHub, allowing program partners, collaborators, and the public to use the same transcriptomics workflow that CCTE has been using to process targeted RNA-seq data.

Description:

This software package will build on an existing prototype package to provide standardized and flexible data science workflows for pipelining and databasing HTTr results using open-source software tools. These include Docker containers, which ensure reproducibility and make the tools more flexible to deploy across multiple compute environments. The software package will be released publicly via the EPA GitHub.

Record Details:

Record Type: DOCUMENT (DATA/SOFTWARE/RAW CODE/CODE PACKAGE)
Product Published Date: 04/30/2024
Record Last Revised: 06/18/2024
OMB Category: Other
Record ID: 361830