Using 'Found' Data to Augment a Probability Sample: Procedure and Case Study.
Overton, J. M. C. ;
Young, T. C. ;
Overton, W. S. ;
California Univ., Los Angeles. Dept. of Biology. ; Clarkson Univ., Potsdam, NY. Dept. of Civil and Environmental Engineering. ; Oregon State Univ., Corvallis. Dept. of Statistics. ; Corvallis Environmental Research Lab., OR.
Water quality ;
Probability theory ;
Data bases ;
Mathematical models ;
Most EPA libraries have a fiche copy filed under the call number shown. Check with individual libraries about paper copy.
Although probability sampling has the advantage of permitting unbiased population estimates, many past and existing monitoring schemes do not employ it. The authors describe and demonstrate a general procedure for augmenting an existing probability sample with data from nonprobability-based surveys ('found' data). The procedure uses sampling frame attributes to group the probability and found samples into similar subsets. This similarity is then assumed to indicate that the found sample is representative of the matching subpopulation. Two methods of establishing similarity and producing estimates are described: pseudo-random and calibration. The pseudo-random method is used when the found sample can contribute additional information on variables already measured for the probability sample, thus increasing the effective sample size. The calibration method is used when the found sample contributes information that is unique to the found observations. Under either approach, the found sample data yield observations that are treated as a probability sample, and population estimates are made according to a probability estimation protocol. To demonstrate these approaches, the authors applied them to found and probability samples of stream discharge data for the southeastern US.
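The pooling idea behind the pseudo-random method can be illustrated with a minimal sketch. This is not the authors' estimator: it assumes, purely for illustration, that units within each frame-defined subset are equally likely to be sampled, so that found observations can be pooled with probability-sample observations and the subset mean expanded by the number of frame units. The function name `pooled_total_estimate`, the subset labels, and all numbers are hypothetical.

```python
from collections import defaultdict


def pooled_total_estimate(frame_counts, prob_sample, found_sample):
    """Estimate a population total by pooling samples within subsets.

    frame_counts: dict mapping subset label -> number of frame units in it.
    prob_sample, found_sample: lists of (subset label, measured value).

    Assumption (not from the source): within a subset, every pooled
    observation is treated as an equal-probability draw, so the subset
    total is estimated as (frame count) * (pooled mean).
    """
    values = defaultdict(list)
    for subset, y in list(prob_sample) + list(found_sample):
        values[subset].append(y)

    total = 0.0
    for subset, n_frame in frame_counts.items():
        obs = values.get(subset)
        if not obs:
            # No observations at all: the subset cannot be estimated.
            raise ValueError(f"no observations for subset {subset!r}")
        total += n_frame * sum(obs) / len(obs)
    return total


# Hypothetical example data: two subsets defined from frame attributes.
frame_counts = {"A": 10, "B": 20}
prob_sample = [("A", 2.0), ("B", 4.0)]
found_sample = [("A", 4.0), ("B", 6.0), ("B", 8.0)]

prob_only = pooled_total_estimate(frame_counts, prob_sample, [])
augmented = pooled_total_estimate(frame_counts, prob_sample, found_sample)
```

In this toy setup the found observations raise the number of values behind each subset mean from one to two or three, which is the sense in which pooling increases the effective sample size. The estimate is only as good as the assumption that found units resemble the probability units in their subset.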