Science Inventory

The Importance of Normalization on Large and Heterogeneous Microarray Datasets

Citation:

PERKINS, E., T. HABIB, L. BURGOON, S. EDWARDS, F. FALCIANI, A. LOGUINOV, DAN VILLENEUVE, C. VULPE, AND N. GARCIA-REYERO. The Importance of Normalization on Large and Heterogeneous Microarray Datasets. Presented at Society of Environmental Toxicology and Chemistry, Boston, MA, November 13 - 17, 2011.

Impact/Purpose:

To document research results

Description:

DNA microarray technology is a powerful functional genomics tool increasingly used to investigate global gene expression in environmental studies. Microarrays can also be used to identify biological networks: they give insight into complex gene-to-gene interactions, networks, and pathways, enabling exploration of how chemicals cause toxicity. Gene expression analysis is a multi-step process, and many sources contribute to systematic variation that can affect the measured gene expression levels. Normalization is one step that minimizes this systematic variation. Appropriate normalization procedures must be implemented so that expression levels can be compared effectively across biological samples within an experiment and between different experiments. In this study we used a dataset composed of 1,472 single-color 15k Agilent arrays from different experiments. All experiments were performed in the same laboratory, on the same tissue (fathead minnow ovary), and with a range of treatments for which we can hypothesize the original assumptions to be correct. We compared two non-linear normalization methods: quantile normalization and fastlo normalization, a slightly modified cyclic loess normalization. We applied two network inference algorithms based on mutual information, the Algorithm for the Reconstruction of Accurate Cellular Networks (ARACNE) and Context Likelihood of Relatedness (CLR), to infer network models from the combined datasets. Results indicated that fastlo was the best normalization method: correlations between interacting genes were enhanced, and models obtained from the combined datasets revealed networks that were associated with specific biological processes or were of potential relevance for ovary biology.
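
As an illustration of one of the two normalization methods named above, the following is a minimal sketch of quantile normalization in Python with NumPy. It is not the authors' code, and the abstract does not describe their implementation. The function name `quantile_normalize` and the genes-by-arrays matrix layout are assumptions made for this example. The sketch also assumes no missing values and breaks ties by position instead of averaging them, which production implementations typically handle differently.

```python
import numpy as np


def quantile_normalize(expr):
    """Quantile-normalize a genes x arrays matrix (hypothetical helper).

    Each column (array) is forced to share the same empirical distribution:
    the mean of the sorted columns. Assumes no missing values; ties are
    broken by position rather than averaged.
    """
    expr = np.asarray(expr, dtype=float)
    # Row indices that would sort each column independently.
    order = np.argsort(expr, axis=0, kind="stable")
    sorted_vals = np.take_along_axis(expr, order, axis=0)
    # Reference distribution: mean intensity at each rank across arrays.
    reference = sorted_vals.mean(axis=1)
    # Write the reference value for each rank back to the gene's position.
    normalized = np.empty_like(expr)
    np.put_along_axis(normalized, order, reference[:, None], axis=0)
    return normalized


# Toy example: 4 genes measured on 3 arrays (made-up numbers).
toy = np.array([
    [5.0, 4.0, 3.0],
    [2.0, 1.0, 4.0],
    [3.0, 3.0, 6.0],
    [4.0, 2.0, 8.0],
])
toy_normalized = quantile_normalize(toy)
```

After this step every array has an identical set of values, and only the ranking of genes within each array differs. Cyclic loess methods such as fastlo instead fit smooth intensity-dependent corrections, so they do not force the distributions to be identical.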

URLs/Downloads:

5543VILLENEUVE.PDF (PDF, 5 KB)

Record Details:

Record Type: DOCUMENT (PRESENTATION/ABSTRACT)
Product Published Date: 11/13/2011
Record Last Revised: 12/20/2012
OMB Category: Other
Record ID: 236330

Organization:

U.S. ENVIRONMENTAL PROTECTION AGENCY

OFFICE OF RESEARCH AND DEVELOPMENT

NATIONAL HEALTH AND ENVIRONMENTAL EFFECTS RESEARCH LABORATORY

MID-CONTINENT ECOLOGY DIVISION

TOXIC EFFECTS CHARACTERIZATION RESEARCH