Record Display for the EPA National Library Catalog


Main Title Using 'Found' Data to Augment a Probability Sample: Procedure and Case Study.
Author Overton, J. M. C. ; Young, T. C. ; Overton, W. S. ;
CORP Author California Univ., Los Angeles. Dept. of Biology. ;Clarkson Univ., Potsdam, NY. Dept. of Civil and Environmental Engineering. ;Oregon State Univ., Corvallis. Dept. of Statistics.;Corvallis Environmental Research Lab., OR.
Publisher cJul 92
Year Published 1992
Report Number EPA/600/J-94/223;
Stock Number PB94-169984
Additional Subjects Water quality ; Sampling ; Population(Statistics) ; Probability theory ; Estimates ; Data bases ; Mathematical models ; Reprints ;
Library Call Number Additional Info Location Last
NTIS  PB94-169984 Some EPA libraries have a fiche copy filed under the call number shown. 07/26/2022
Collation 20p
Abstract While probability sampling has the advantage of permitting unbiased population estimates, many past and existing monitoring schemes do not employ it. The authors describe and demonstrate a general procedure for augmenting an existing probability sample with data from nonprobability-based surveys ('found' data). The procedure uses sampling frame attributes to group the probability and found samples into similar subsets. This similarity is then assumed to reflect the representativeness of the found sample for the matching subpopulation. Two methods of establishing similarity and producing estimates are described: pseudo-random and calibration. The pseudo-random method is used when the found sample can contribute additional information on variables already measured for the probability sample, thus increasing the effective sample size. The calibration method is used when the found sample contributes information that is unique to the found observations. Under either approach, the found sample data yield observations that are treated as a probability sample, and population estimates are made according to a probability estimation protocol. To demonstrate these approaches, the authors applied them to found and probability samples of stream discharge data for the southeastern US.
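
The pseudo-random idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' estimator. It assumes each observation has already been assigned to a subset defined by sampling frame attributes, that found observations in a subset are treated as if they were drawn at random from that subset, and that the number of frame units per subset is known. The function names, subset labels, and all numbers are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def pooled_subset_means(prob_sample, found_sample):
    """Mean response per subset, pooling found and probability observations.

    Each sample is a list of (subset_label, value) pairs. Found observations
    are used only in subsets that also contain probability observations
    (an assumption of this sketch, standing in for the similarity check).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    prob_subsets = {subset for subset, _ in prob_sample}
    for subset, value in prob_sample:
        totals[subset] += value
        counts[subset] += 1
    for subset, value in found_sample:
        if subset in prob_subsets:  # discard found data with no matching subset
            totals[subset] += value
            counts[subset] += 1
    return {subset: totals[subset] / counts[subset] for subset in totals}


def population_mean(subset_means, frame_counts):
    """Combine subset means, weighting each by its number of frame units."""
    covered = [s for s in frame_counts if s in subset_means]
    total_units = sum(frame_counts[s] for s in covered)
    weighted = sum(frame_counts[s] * subset_means[s] for s in covered)
    return weighted / total_units


# Hypothetical data: subsets "A" and "B" are defined by frame attributes.
prob_sample = [("A", 2.0), ("A", 4.0), ("B", 10.0)]
found_sample = [("A", 6.0), ("B", 14.0), ("C", 100.0)]  # "C" has no match
frame_counts = {"A": 30, "B": 10}  # assumed number of frame units per subset

means = pooled_subset_means(prob_sample, found_sample)
estimate = population_mean(means, frame_counts)
```

In this sketch the found observations raise the number of values behind each subset mean, which is the sense in which the effective sample size increases. The found value in subset "C" is ignored because no probability observations exist there to support the similarity assumption.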