Science Inventory

MIPHENO: Data normalization for high throughput metabolic analysis.

Citation:

Bell, S., L. BURGOON, AND R. Last. MIPHENO: Data normalization for high throughput metabolic analysis. BMC Bioinformatics. BioMed Central Ltd, London, UK, 13(10):doi:1186-1471, (2012).

Impact/Purpose:

Here we describe MIPHENO (Mutant Identification by Probabilistic High throughput-Enabled Normalization), an approach for normalizing quantitative first-pass screening data without the need for explicit in-group controls. This approach includes a quality control step and facilitates cross-experimental comparisons that decrease the false non-discovery rates, while maintaining the high accuracy needed to limit false positives in first-pass screening.

Description:

High throughput methodologies such as microarrays, mass spectrometry and plate-based small molecule screens are increasingly used to facilitate discoveries ranging from gene function to drug candidate identification. These large-scale experiments are typically carried out over months or years, often without the controls needed to compare directly across the dataset. Few methods are available to facilitate comparisons of high throughput metabolic data generated in batches where explicit in-group controls for normalization are lacking.

Here we describe MIPHENO (Mutant Identification by Probabilistic High throughput-Enabled Normalization), an approach for normalizing quantitative first-pass screening data without the need for explicit in-group controls. This approach includes a quality control step and facilitates cross-experimental comparisons that decrease the false non-discovery rates, while maintaining the high accuracy needed to limit false positives in first-pass screening. Results from simulation show an improvement in the area under the receiver operating characteristic curve: 0.955 for MIPHENO vs 0.923 for a group-based statistic (z-score). A decrease in the false non-discovery rate and an increase in accuracy were also observed across a variety of population parameters, while permitting cross-dataset comparison. Analysis of the high throughput phenotypic data from the Arabidopsis Chloroplast 2010 Project (http://www.plastid.msu.edu/) showed an approximately 4-fold increase in the ability to detect previously described or expected phenotypes over the group-based statistic.
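To make the comparison above concrete, the following minimal sketch contrasts a group-based z-score with a simple control-free rescaling across batches. It is an illustration only and is not the MIPHENO algorithm, which this record does not specify. The function names, the median-scaling scheme, and the sample values are all assumptions introduced for this example.

```python
from statistics import mean, median, stdev


def group_z_scores(batch):
    """Group-based z-score: each value relative to its own batch mean and SD."""
    mu = mean(batch)
    sd = stdev(batch)
    return [(x - mu) / sd for x in batch]


def median_scaled(batches):
    """Hypothetical control-free normalization (an assumption, not MIPHENO).

    Each batch is divided by its own median and rescaled to the median of
    all batch medians, so values from different batches share one scale
    without requiring explicit in-group controls.
    """
    medians = [median(b) for b in batches]
    target = median(medians)
    return [[x / m * target for x in b] for b, m in zip(batches, medians)]


# Made-up measurements: batch_b carries a 2x multiplicative batch effect,
# and batch_a contains one putative outlier (the last value).
batch_a = [10.0, 11.0, 9.0, 10.0, 20.0]
batch_b = [20.0, 22.0, 18.0, 20.0, 21.0]

z_a = group_z_scores(batch_a)
normalized = median_scaled([batch_a, batch_b])
```

The z-scores are meaningful only within a batch, whereas the rescaled values can be pooled across batches before a threshold or probability is applied, which is the kind of cross-dataset comparison the abstract describes.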

URLs/Downloads:

BMC BIOINFORMATICS

Record Details:

Record Type: DOCUMENT (JOURNAL/PEER REVIEWED JOURNAL)
Product Published Date: 01/13/2012
Record Last Revised: 01/22/2013
OMB Category: Other
Record ID: 233163