DESIGNING ENVIRONMENTAL MONITORING DATABASES FOR STATISTICAL ASSESSMENT
Hale, S. S. and H. Buffum. DESIGNING ENVIRONMENTAL MONITORING DATABASES FOR STATISTICAL ASSESSMENT. Presented at EMAP Symposium on Western Ecological Systems, San Francisco, CA, April 6-9, 1999.
Databases designed for statistical analyses have characteristics that distinguish them from databases intended for general use. EMAP uses a probabilistic sampling design to collect data that support statistical assessments of environmental conditions. In addition to supporting the statistical analyses, these data are later made publicly available as summary databases on the EMAP web site for a general audience. Like writers, database designers should target their expected audience. Analytical databases designed to support statistical analyses usually have a restricted scope: in time, in geographic extent, and especially in data types and contents, which are often limited to a particular scientific discipline. The primary users may be the same people who designed the study; they are familiar with both the methods and the data and do not need the robust metadata that are essential for general-purpose databases. The interface should be straightforward and efficient; built-in software tools for viewing and analysis are not needed. The analytical database is often designed in a more horizontal than vertical format to ease loading into statistical analysis software packages. These databases may include information on the statistical design, such as the inclusion probability of given data points. All replicates and other data used to estimate sample variance are included. The analytical database may combine a subset of the study data with data from other sources. These design issues are illustrated using an analytical database for estuaries that supports environmental assessments in a broad region, such as the EMAP Western Geographic Study.
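As a rough illustration of two of the design points above (a horizontal layout and stored inclusion probabilities), the following Python sketch pivots vertical records into one row per station and replicate, then uses the inclusion probabilities as inverse weights in a ratio estimate of a mean. The station IDs, parameter names, values, probabilities, and function names are all hypothetical and are not taken from the EMAP database; the weighting shown is one common design-based approach, not necessarily the one the authors used.

```python
from collections import defaultdict

# Hypothetical "vertical" records: one row per station, replicate, and parameter.
# All identifiers and values here are invented for illustration.
long_rows = [
    ("ST01", 1, "dissolved_oxygen", 6.0),
    ("ST01", 1, "salinity", 30.0),
    ("ST02", 1, "dissolved_oxygen", 4.0),
    ("ST02", 1, "salinity", 20.0),
    ("ST03", 1, "dissolved_oxygen", 8.0),
    ("ST03", 1, "salinity", 10.0),
]

# Hypothetical inclusion probabilities from the sampling design, one per station.
inclusion_prob = {"ST01": 0.5, "ST02": 0.25, "ST03": 0.25}


def to_horizontal(rows):
    """Pivot vertical records into one dict per (station, replicate),
    with one key per measured parameter."""
    wide = defaultdict(dict)
    for station, replicate, parameter, value in rows:
        wide[(station, replicate)][parameter] = value
    return [
        {"station": station, "replicate": replicate, **params}
        for (station, replicate), params in sorted(wide.items())
    ]


def weighted_mean(wide_rows, parameter, probs):
    """Ratio estimate of a mean, weighting each row by 1 / inclusion probability.

    Assumes one row per station; with several replicates per station,
    the replicates would need to be averaged per station first.
    """
    numerator = 0.0
    denominator = 0.0
    for row in wide_rows:
        if parameter not in row:
            continue  # skip rows where this parameter was not measured
        weight = 1.0 / probs[row["station"]]
        numerator += weight * row[parameter]
        denominator += weight
    return numerator / denominator


wide_rows = to_horizontal(long_rows)
do_mean = weighted_mean(wide_rows, "dissolved_oxygen", inclusion_prob)
```

Keeping the inclusion probability alongside each station means an analyst can compute design-based estimates directly from the horizontal table without going back to the sampling design documents.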