Office of Research and Development Publications

ROBUST ESTIMATION OF MEAN AND VARIANCE USING ENVIRONMENTAL DATA SETS WITH BELOW DETECTION LIMIT OBSERVATIONS

Citation:

Singh, A. AND J. M. Nocerino. ROBUST ESTIMATION OF MEAN AND VARIANCE USING ENVIRONMENTAL DATA SETS WITH BELOW DETECTION LIMIT OBSERVATIONS. CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS 60(1-2):69-86, (2002).

Impact/Purpose:

The overall objective of the chemometrics and environmetrics program and this task is to examine and evaluate the statistical procedures and methods used in the measurement or experimentation process and to improve those procedures and methods (if deemed inadequate) by investigating, developing, and evaluating statistical methods, algorithms, and software to reduce data uncertainty. The measurement or experimentation process encompasses: decision objectives and design, sampling design, sampling, experimental design, quality control, data collection, signal processing and data manipulation, data analysis, validation, and decision analysis. Other general objectives of the program are to evaluate certain existing, developed, or potential performance measurements for information content, relevancy, and cost-effectiveness. The objectives of the sampling research area are to provide the Agency with improved state-of-the-science guidance, strategies, and techniques to more accurately and effectively collect solid particulate field and laboratory subsamples that best represent the extent and degree of contamination at a given site.

Description:

Scientists, especially environmental scientists, often encounter trace level concentrations that are typically reported as less than a certain limit of detection, L. Type I left-censored data arise when certain low values lying below L are ignored or unknown because they cannot be measured accurately. In many environmental quality assurance and quality control (QA/QC) and groundwater monitoring applications of the United States Environmental Protection Agency (U.S. EPA), values smaller than L are not required to be reported. However, practitioners still need to obtain reliable estimates of the population mean, μ, and the standard deviation (sd), σ. The problem becomes complex when a small number of high concentrations are observed together with a substantial number of concentrations below the detection limit. The high outlying values contaminate the underlying censored sample, leading to distorted estimates of μ and σ. The U.S. EPA, through the National Exposure Research Laboratory-Las Vegas (NERL-LV), under the Office of Research and Development (ORD), has research interests in developing statistically rigorous robust estimation procedures for contaminated left-censored data sets. Robust estimation procedures based upon a proposed (PROP) influence function are shown to result in reliable estimates of the population mean and sd using contaminated left-censored samples. It is also observed that the robust estimates thus obtained, with or without the outliers, are in close agreement with the corresponding classical estimates after the removal of outliers. Several classical and robust methods for the estimation of μ and σ using left-censored (truncated) data sets with potential outliers have been reviewed and evaluated.
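
The distortion described above can be illustrated with a minimal sketch. The code below is not the PROP procedure of the paper. It is a classical (non-robust) maximum likelihood estimate for a normal sample that is Type I left-censored at L, computed with a simple EM iteration. The function name `censored_normal_mle`, the data values, the detection limit, and the outlier are all hypothetical and chosen only for illustration. The normality assumption is also an assumption of this sketch.

```python
import math


def _pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def censored_normal_mle(detected, n_censored, limit, max_iter=5000, tol=1e-10):
    """EM estimate of (mu, sigma) for a normal sample left-censored at `limit`.

    detected   -- reported values (all at or above the detection limit)
    n_censored -- count of observations reported only as "< limit"
    """
    if len(detected) < 2:
        raise ValueError("need at least two detected values")
    n = len(detected) + n_censored
    total = sum(detected)
    # Starting values: place each censored observation at the limit.
    mu = (total + n_censored * limit) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in detected) / n) or 1.0
    for _ in range(max_iter):
        # E-step: moments of a normal variable conditioned on X < limit.
        z = (limit - mu) / sigma
        lam = _pdf(z) / max(_cdf(z), 1e-300)
        cens_mean = mu - sigma * lam
        cens_var = max(sigma ** 2 * (1.0 - z * lam - lam * lam), 0.0)
        # M-step: update mu and sigma from the completed-data moments.
        new_mu = (total + n_censored * cens_mean) / n
        ss = sum((x - new_mu) ** 2 for x in detected)
        ss += n_censored * (cens_var + (cens_mean - new_mu) ** 2)
        new_sigma = math.sqrt(ss / n)
        converged = abs(new_mu - mu) < tol and abs(new_sigma - sigma) < tol
        mu, sigma = new_mu, new_sigma
        if converged:
            break
    return mu, sigma


# Hypothetical data: six detected concentrations, six results below L = 1.0.
detected = [1.2, 1.5, 2.0, 2.4, 3.1, 3.3]
clean = censored_normal_mle(detected, n_censored=6, limit=1.0)
# Same sample with one hypothetical high outlier added.
contaminated = censored_normal_mle(detected + [50.0], n_censored=6, limit=1.0)

print("without outlier: mean=%.3f sd=%.3f" % clean)
print("with outlier:    mean=%.3f sd=%.3f" % contaminated)
```

Comparing the two printed lines shows how much a single high value can move the classical estimates when half of the sample is below the detection limit. A robust procedure of the kind the abstract describes is intended to limit that influence.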

Record Details:

Record Type: DOCUMENT (JOURNAL/PEER REVIEWED JOURNAL)
Product Published Date: 01/28/2002
Record Last Revised: 12/22/2005
Record ID: 65203