Grantee Research Project Results
2005 Progress Report: National Research Program on Design-Based/Model-Assisted Survey Methodology for Aquatic Resources
EPA Grant Number: R829096
Center: Center for Air, Climate, and Energy Solutions
Center Director: Robinson, Allen
Title: National Research Program on Design-Based/Model-Assisted Survey Methodology for Aquatic Resources
Investigators: Stevens, Don L. , Urquhart, N. Scott , Herlihy, Alan T. , Lesser, Virginia
Institution: Oregon State University , Colorado State University
EPA Project Officer: Packard, Benjamin H
Project Period: October 15, 2001 through October 14, 2005 (Extended to October 13, 2006)
Project Period Covered by this Report: October 15, 2004 through October 14, 2005
Project Amount: $2,989,884
RFA: Research Program on Statistical Survey Design and Analysis for Aquatic Resources (2001)
Research Category: Ecological Indicators/Assessment/Restoration , Tribal Environmental Health Research , Watersheds , Water , Aquatic Ecosystems
Objective:
The objective of this research project is to develop and implement design-based/model-assisted statistical methods for aquatic surveys.
Progress Summary:
Designs and Models for Aquatic Resource Surveys (DAMARS) began its fourth year of collaboration with Oregon Department of Fish and Wildlife (ODFW) and the U.S. Environmental Protection Agency Western Ecology Division (WED) on the sampling of coho salmon. Work continues on the description of a spatial pattern and the imputation of missing data, but there has been an increased emphasis on adapting designs to special circumstances. We have initiated studies that address the modification of a rotating panel design to accommodate changes in the target population or improved frame information. We also are investigating the efficiency of a spatially balanced sample, such as a Generalized Random Tessellation Stratified (GRTS) sample, versus an adaptive sample if the population exhibits some degree of clustering.
The utility of the GRTS design for sampling and monitoring environmental resources is becoming widely recognized by other federal agencies. Director Don L. Stevens, Jr. has attended multiple workshops sponsored by the National Park Service, the Bureau of Land Management, Forest Service, the National Oceanic and Atmospheric Administration (NOAA)-Fisheries, and other federal agencies where the GRTS design has been adopted or is being considered.
During the 4 years of the research program, DAMARS staff has made more than 130 presentations. These include presentations at premier professional meetings such as the Joint Statistical Meetings, Ecological Society of America, Eastern and Western North America Regions of the Biometrics Society, the International Statistical Institute, and The International Environmetrics Society. Staff also have presented seminars at a number of universities, colleges, and high schools, as well as attended and presented at a variety of workshops.
More than 20 papers have been published or are in press in professional journals, federal and state agency publications, and book chapters. Twenty additional papers currently are in review and many others are in various stages of preparation.
DAMARS has established ongoing working relationships with several client agencies that have applied the products of both DAMARS and Space-Time Aquatic Resources Modeling and Analysis Program (STARMAP) research to ongoing programs. The applications include San Francisco Estuary Regional Monitoring Program for Trace Substances (RMP), West Coast Tidal Wetland Monitoring and Assessment Venture (California Rapid Assessment Method [CRAM]), sampling coho salmon in Oregon coastal streams (ODFW and the Oregon Plan for Salmon and Watersheds [OPSW]), and the Great Lakes Environmental Indicators.
Project 1: Direction & Administration
Dr. Stevens, Jr., monitored the progress of Projects 2-5, including oversight of budgets, staffing, and coordination; monitored subcontracts to Colorado State University (CSU), Washington State University, and Iowa State University; assembled and submitted quarterly and annual reports; and coordinated matters with STARMAP, the CSU program, and the program’s U.S. Environmental Protection Agency Project Officer.
At the 2004 Joint Program meeting with STARMAP, the Scientific Advisory Committee (SAC) recommended that the 2005 program meeting be expanded to include a conference and that participants from outside DAMARS and STARMAP be invited. Therefore, the 2005 program meeting became the Conference on Statistics for Aquatic Resource Monitoring, Modeling, and Management (SARMMM) (http://oregonstate.edu/dept/statistics/SARMMM/meeting.html). The conference was organized as a joint meeting of DAMARS, STARMAP, and the OPSW. The International Environmetrics Society (TIES) and the Section on Statistics and the Environment (ENVR) of the American Statistical Association (ASA) cosponsored the meeting. The meeting was an outstanding success. It was attended by approximately 115 persons from a variety of state agencies, tribes, universities, and environmental organizations, and it included a variety of presentations by persons from outside DAMARS and STARMAP. The plenary session with OPSW identified a pressing need for continued statistical support of state, federal, and tribal aquatic monitoring programs.
Project 2: Integration and Extramural Outreach
This project has two main emphases: (1) the development of learning materials to aid in transferring the statistical methodology of monitoring, and (2) interactive collaboration with state and tribal agencies. This is a joint project with the STARMAP Program at CSU. STARMAP has focused on learning materials, and DAMARS has concentrated on developing collaborative relationships, although both programs have engaged in both activities.
Development of browser-based learning materials continues. A decision was made to provide all material in PDF format; Stacey Hancock, a statistics graduate student at CSU, is incorporating video clips of the field training into the materials.
Wetlands are included in aquatic systems. Dr. N. Scott Urquhart participated in a review of the methodology being used for the upcoming National Wetlands Status and Trends report. Dr. Urquhart’s review produced additional contacts and a wetlands data set, on which an investigation has just started. Minnesota’s Department of Natural Resources is designing an expanded sampling of wetlands (beyond NWI’s 175 plots); Dr. Urquhart is assisting in that design effort. The objective of this effort will be to incorporate costs into decision making about the optimal size of wetland monitoring plots. The preliminary spatial analysis is interesting.
Dr. Urquhart actively recruited high school students into statistics by giving talks to two advanced placement statistics classes in a Fort Collins high school.
Doctoral student Leigh Ann Harrod (Oregon State University [OSU]) developed a manual entitled Ignorable Nonresponse Adjustment Procedures and Algorithms, with an accompanying CD-ROM. The manual guides the user through data analysis for probability-based survey data with nonresponse; provides documentation for the weighting adjustment functions; and provides a copy of the R software.
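As an illustration of the kind of weighting adjustment such a manual documents, the following is a minimal sketch of a weighting-class adjustment for ignorable nonresponse. It is not taken from the manual or its R functions; the function name, the field names, and the "public"/"private" classes are hypothetical. The sketch assumes that, within each class, respondents and nonrespondents do not differ systematically, so the design weight of the nonrespondents can be redistributed to the respondents in the same class.

```python
from collections import defaultdict


def weighting_class_adjust(units):
    """Return respondents with nonresponse-adjusted weights.

    units: list of dicts with hypothetical keys 'weight' (design weight),
    'cls' (adjustment class), and 'responded' (bool).
    """
    sampled = defaultdict(float)    # total design weight of all sampled units, by class
    responded = defaultdict(float)  # total design weight of respondents, by class
    for u in units:
        sampled[u["cls"]] += u["weight"]
        if u["responded"]:
            responded[u["cls"]] += u["weight"]

    adjusted = []
    for u in units:
        if not u["responded"]:
            continue
        # Inflate each respondent's weight so that the class total is preserved.
        factor = sampled[u["cls"]] / responded[u["cls"]]
        adjusted.append({**u, "adj_weight": u["weight"] * factor})
    return adjusted


# Hypothetical sample: one of two private-land sites could not be accessed.
sample = [
    {"id": 1, "cls": "public", "weight": 10.0, "responded": True},
    {"id": 2, "cls": "public", "weight": 10.0, "responded": True},
    {"id": 3, "cls": "private", "weight": 20.0, "responded": True},
    {"id": 4, "cls": "private", "weight": 20.0, "responded": False},
]
result = weighting_class_adjust(sample)
```

In this sketch the adjusted weights sum to the same total as the original design weights, which is the property a weighting-class adjustment is meant to preserve. A class with no respondents would need to be merged with another class before the adjustment is applied.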
The San Francisco Estuary Institute (SFEI) has used a GRTS-based rotating panel design for monitoring trace contaminants in the San Francisco estuary since 2002. Dr. Stevens, Jr., worked with personnel at SFEI to analyze data from a variable probability survey design and acquainted them with the R software for survey design currently being developed by Tony Olsen at EPA’s National Health and Environmental Effects Research Lab-Western Ecology Division.
Dr. Stevens, Jr. visited with the mathematics faculty and students at Eastern OSU to interest potential graduate students in environmental statistics. Dr. Stevens, Jr., presented a seminar on “Environmental Monitoring, Statistics, and the Art of Non-Representation: The Need and Evidence for a Paradigm Shift.” He also visited a mathematical modeling class and discussed statistical modeling using the Oregon coastal coho salmon population as an example.
At the 2004 Joint Program Meeting, the Science Advisory Committee (SAC) recommended that the two programs undertake and publicize a large case study. The program’s directors, however, in consultation with program personnel, preferred a special issue of an environmental statistics journal as a more appropriate mechanism. Dr. Stevens, Jr., contacted Abdel El-Shaarawi, editor of Environmetrics, and received a positive response. A tentative publication target of mid-2006 is being considered, meaning that papers should be ready for review by early 2006. The special issue is conceived as a series of papers that address major features of surveys on aquatic resources, from design considerations to analysis and presentation concerns.
ODFW/OPSW have questions that could be addressed by the analysis methodology developed by the two programs and a rich collection of data sets for developing and illustrating the methodology, making it possible for many of the papers in the special issue to use ODFW/OPSW data. A number of statistical concerns that are potential topics for the special issue have been identified. Some of these concerns are listed below.
- How to use all available data in an analysis–In many instances, both probability and nonprobability data are available, e.g., the habitat basis surveys or special focus fish studies. How can these types of data be combined?
- How to account for missing data–ODFW has a strong indication that land use and ownership are powerful explanatory variables, and there is a differential in the access rate between the various classes making imputation methods necessary.
- Need for post-stratification–In some cases, data are not missing, but there is a discrepancy between the class proportions of a basin derived from geographic information systems (GIS) and the sample proportions, suggesting a need to post-stratify. What is the best way to do so?
- Small area estimation–The original coho survey was designed to provide answers at five monitoring areas, which were loosely based on evolutionarily significant units. Current thought, however, is that there are more than 30 populations that must be treated as distinct under the Endangered Species Act. Some of these populations have limited geographic extent; the present sampling design provides few observations.
- Design modification and sample reallocation–Is it possible for the present sample to be reallocated to put more sample points in the small populations? What will be the impact on the panel structure? The present design includes an annual panel, a 3-year panel, a 9-year panel, and a 27-year panel. Annual visits may be too frequent for habitat observations. How can the habitat sample be reallocated with minimal impact on site co-location with juvenile and adult samples?
- Plot design–There are several data sets that could be used to address the trade-off between the extent of a particular plot and the number of plots, or to look at local versus large-scale spatial correlation. The data sets include both habitat and fish data.
- Reference sites: How to use them–This relates to the “use of all data” question. Some reference sites were chosen subjectively; some were picked from the probability samples. What is the appropriate way to compare the cumulative distribution function (cdf) of a population estimated from a probability sample to a “reference cdf”? More specifically, what is the “reference cdf”?
- Measurement error biases and impact of varying detection probability–There are a number of data sets that could be used to examine some detection probability questions or, more generally, the impact of measurement errors on population estimates. Also, some metrics (e.g., counts from snorkeling) seem to be consistent at one site, but vary from site to site. This means that repeat observations at the same site but at different times by different crews seem to yield consistent detection probabilities, but probability varies substantially from site to site.
- Multiscale data–ODFW has compiled land use and ownership data, but other GIS coverages are available from CLAMS, or can be created using the GIS tools being developed by Dr. Dave Theobald at CSU. Two questions currently being considered are (1) what are the appropriate coverages, and (2) what are the best ways to use them in conjunction with the site-specific data?
- Misaligned data–This issue is similar to the multiscale data issue, but it involves different data sets collected at different sites, possibly at the same scale.
- Data display–Ruben Smith has developed several methods for displaying spatial pattern. The SAC suggested that the programs consider other graphical methods of displaying and visualizing data, such as the methods developed by Dan Carr.
- Trend detection–The rotating panel was designed both to quantify current status and to detect trend. What are appropriate analysis methods to detect trend? Also, what is the most effective way to account for effects such as changing ocean conditions in watershed management impact assessment? Finally, what is the connection between linear model and design-based approaches?
- Status assessment–Current tools include the cdf, which is based on data collected in a single year, but spatial and temporal correlation is present. Ancillary information also is available. How can this information be used to improve the precision of current status estimates?
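Several of the concerns above refer to the cdf of a population estimated from a probability sample. The following is a minimal sketch of a design-weighted (ratio-type) cdf estimate, in which each sampled site is weighted by the inverse of its inclusion probability. The function name and the example values are hypothetical, and the sketch does not address the spatial and temporal correlation or the ancillary information raised in the status assessment item.

```python
def weighted_cdf(values, weights, x):
    """Estimate the proportion of the population with value <= x.

    values:  responses observed at the sampled sites
    weights: design weights (inverse inclusion probabilities)
    """
    total = sum(weights)
    below = sum(w for v, w in zip(values, weights) if v <= x)
    return below / total


# Hypothetical indicator values at four sampled sites with unequal weights.
values = [2.0, 5.0, 7.0, 11.0]
weights = [10.0, 30.0, 40.0, 20.0]
cdf_at_6 = weighted_cdf(values, weights, 6.0)
```

Dividing by the sum of the weights rather than by a known population size keeps the estimate between 0 and 1 even when the frame size is uncertain, which may be relevant where the frame is known to be incomplete.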
Dr. Stevens, Jr., has continued working with the Core Development Team for the CRAM for wetland condition. The Core Development Team includes representatives from EPA Region 9, SFEI, the Southern California Coastal Water Research Project (SCCWRP), the California Conservation Core, the California Coastal Commission, and the University of California, Los Angeles. The CRAM is modeled on the Ohio RAM (ORAM) and is being extended to cover wetland types in California, e.g., salt marshes and tidally influenced wetlands. In January, Dr. Stevens, Jr., met with the Core Development Team to discuss approaches to calibrating CRAM. The proposed approach was submitted to EPA Region 9 for approval. Calibration will take place in late 2005 through early 2006.
A full-day course, “Designing Aquatic Resource Surveys,” was presented twice: in Anchorage, and in conjunction with the SARMMM conference at Corvallis, OR, in September. The primary audience for both courses was aquatic monitoring practitioners in federal, state, and tribal agencies. The Anchorage course drew approximately 15 attendees, and more than 60 persons attended the Corvallis course.
Dr. Stevens, Jr., participated in the redesign of the monitoring plan for San Francisco (SF) Bay as part of a design team consisting of representatives from the SFEI, EPA Region 9, DAMARS, U.S. Geological Survey, and the SF Bay Area Regional Water Resource Control Board. The design is an excellent example of using prior information to guide design. Separate designs were instituted for water column and sediment. The sediment design applies rotating panel GRTS methodology.
Project 3: Survey Design Methodology for Aquatic Resources
Doctoral students Leigh Ann Harrod and William Gaeuman continued work on trend detection methodology for a survey design with multiple temporal panels. Ms. Harrod also is working on the application of quadratic inference functions as a tool for detecting trends in surveys.
Doctoral students Kathi Georgitis and Alix Gitelman are investigating the use of graphical models to address multiscale questions in ecological studies. The conditional modeling approach allows for specification of correlation between variables at the same scale and across multiple scales. This approach also can be used to explore how the presence or absence of a species is related to vegetation characteristics at multiple scales, similar to Bayesian hierarchical models.
Dr. Loveday Conquest is collaborating with a scientific team from Washington Sea Grant in addressing sampling issues in the assessment of seabird bycatch in Alaska longline fisheries.
Dr. Conquest completed a review of sampling approaches as part of the monitoring plan regarding the relationship between stream channel networks and sediment delivery from roads to streams. This review was performed under the auspices of the Forest Practices Division of the Washington State Department of Natural Resources.
Postdoctoral fellow Ruben Smith continued work with ODFW personnel and developed prediction maps of the numbers of returning adult coho in Oregon coastal streams for the year 2003. The model incorporated the Oregon coho populations as a covariate. This map was used by ODFW in its assessment of the status of the Oregon Plan for Salmon and Watersheds. Dr. Smith continued working on the development and implementation of the spatio-temporal Poisson Linear Geostatistical Model for the coho salmon. Efforts are underway to incorporate ocean conditions using the Pacific Decadal Oscillation (PDO) index to determine the influence of ocean conditions on salmon populations.
Doctoral student William Gaeuman continued work on trend detection methodology for a survey design with multiple temporal panels. It appears that the covariance between panels in a GRTS design depends on their position along the reverse hierarchically ordered sample points. We are now working on a way to incorporate this information into the variance estimator for the proposed estimator of the trend cdf.
Dr. Stevens, Jr., Dr. Smith, and Mr. Gaeuman continued working with ODFW to address several sampling design issues. The current sample that ODFW uses to monitor Oregon coastal coho salmon was created in 1998. Since then, several issues have developed:
- The 1998 sample frame was incomplete; in some cases, as much as 50 percent of the potential salmon spawning habitat was omitted.
- More detailed examination of the coho population structure indicates that there are as many as 30 distinct populations. Many of these are small and do not contain an adequate number of samples under the 1998 design. The issue now is how to augment the sample in these populations with minimal impact on the temporal panel structure.
- The 1998 implementation of the spatially balanced algorithm had a coding error that resulted in some spatial imbalance. The issue here is how to “rebalance” the sample with minimal impact on temporal panel structure.
- The revisit period was selected to coincide with the 3-year life cycle of coho salmon, but that is too frequent for the habitat sample. We need to reallocate some of the habitat sampling effort that goes into annual and 3-year revisits to get more extensive spatial coverage.
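To make the revisit structure behind these issues concrete, the following is a minimal sketch of a visit schedule for a design with annual, 3-year, 9-year, and 27-year panels. It rests on an assumption that is not stated in this report: that a panel with a k-year revisit period is split into k sub-panels, one of which is visited each year. The function names and the 1998 start year for the rotation are likewise assumptions made for illustration.

```python
# Revisit periods (in years) of the panels named in this report.
REVISIT_INTERVALS = [1, 3, 9, 27]


def panels_visited(year, start_year=1998):
    """Return (revisit_interval, sub_panel_index) pairs visited in a given year.

    Assumes (hypothetically) that a k-year panel has k sub-panels and that
    sub-panel j is visited whenever (year - start_year) % k == j.
    """
    offset = year - start_year
    if offset < 0:
        raise ValueError("year precedes the start of the design")
    return [(k, offset % k) for k in REVISIT_INTERVALS]


def revisit_years(interval, sub_panel, start_year, end_year):
    """List the years in [start_year, end_year] in which a sub-panel is visited."""
    return [
        y
        for y in range(start_year, end_year + 1)
        if (y - start_year) % interval == sub_panel
    ]


visited_2005 = panels_visited(2005)
three_year_visits = revisit_years(3, 0, 1998, 2007)
```

Under this assumed structure, every year's field effort is split across one sub-panel from each revisit period. Shifting habitat effort away from the annual and 3-year panels toward the longer-period panels would therefore increase the number of distinct sites covered over time, at the cost of fewer repeat visits to any one site.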
Project 4: Parametric Model Assisted Survey Methods for Environmental Surveys
The manuscript “Adjustment Procedures to Account for Non-ignorable Missing Data in Environmental Surveys” by Drs. Muñoz-Hernández, Lesser, and Ruben Smith was accepted by Environmetrics. Drs. Lesser and Muñoz-Hernández are revising the manuscript in response to reviewer comments.
Project 5: Nonparametric Model Assisted Survey Estimation for Aquatic Resources Project (Joint with STARMAP Project 2)
Work continues in the use of nonparametric modeling for survey regression estimation, jointly supported by DAMARS and STARMAP. Numerous presentations have been made and several manuscripts have been submitted and published. Current problems of interest in this area combine landscape-level auxiliary data (such as those from GIS coverages) with field observations. The inferential problems range from model-assisted descriptive inferences for aquatic populations, to model-based small area estimates. Drs. Breidt and Opsomer, together with a postdoctoral fellow, students, and colleagues, extended earlier results on local polynomial survey regression estimation in a number of directions, some of which are described here.
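The general form of a model-assisted estimator with a nonparametric working model can be sketched as follows. This is an illustration only and not the estimator developed by Drs. Breidt and Opsomer: it substitutes a design-weighted Gaussian kernel mean (a degree-zero local polynomial) for the local polynomial fits described above, and all function names, parameters, and data values are hypothetical. The estimated total is the sum of fitted values over the whole population, where the auxiliary variable is known, plus a design-weighted sum of sample residuals that corrects for model misfit.

```python
import math


def kernel_fit(x0, xs, ys, pis, bandwidth):
    """Design-weighted kernel mean of ys at x0 (degree-zero local polynomial)."""
    num = 0.0
    den = 0.0
    for x, y, pi in zip(xs, ys, pis):
        k = math.exp(-0.5 * ((x - x0) / bandwidth) ** 2)  # Gaussian kernel
        num += k * y / pi
        den += k / pi
    return num / den


def model_assisted_total(pop_x, sample_idx, sample_y, pis, bandwidth):
    """Estimate a population total of y using auxiliary x known for every unit.

    pop_x:      auxiliary value for every population unit (e.g. from a GIS layer)
    sample_idx: indices of the sampled units within pop_x
    sample_y:   field observations at the sampled units
    pis:        inclusion probabilities of the sampled units
    """
    xs = [pop_x[i] for i in sample_idx]

    def fitted(x0):
        return kernel_fit(x0, xs, sample_y, pis, bandwidth)

    # Model part: fitted values summed over the entire population.
    model_part = sum(fitted(x) for x in pop_x)
    # Design part: inverse-probability-weighted residuals from the sample.
    correction = sum(
        (y - fitted(x)) / pi for x, y, pi in zip(xs, sample_y, pis)
    )
    return model_part + correction


# Hypothetical population of 10 units with a landscape-level covariate.
pop_x = [float(i) for i in range(10)]
estimate = model_assisted_total(
    pop_x, sample_idx=[1, 4, 8], sample_y=[3.0, 6.5, 12.0],
    pis=[0.3, 0.3, 0.3], bandwidth=2.0,
)
```

The residual correction is what distinguishes a model-assisted estimator from a purely model-based prediction: the working model is used to gain efficiency, while the design weights carry the inference.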
The Master of Science (MS) project of Alicia Johnson on cdf estimation was submitted for publication, and Siobhan Everson-Stewart’s MS project on two-dimensional kernel estimators was extended by Dr. Giovanna Ranalli to include the setting of penalized splines. Drs. Ranalli and Breidt also obtained promising preliminary results on the use of low-rank radial basis functions for smoothing data from river networks.
Dr. Ji-Yeon Kim successfully defended her Ph.D. under the direction of Drs. Breidt and Opsomer; she submitted a joint paper on two-stage local polynomial regression estimation. Ph.D. student Jehad al-Jararha is adapting the two-stage local polynomial estimator to other cluster-mean models and other types of auxiliary information. New work was conducted with Gerda Claeskens on nonparametric model-assisted estimation using penalized splines, and was extended to include the case of small area estimation. Ph.D. student Mark Delorey is extending penalized spline estimators further to include the setting of two-stage sampling.
Drs. Nan-Jung Hsu and Hsin-Cheng Huang from Taiwan visited CSU as research scientists collaborating with Dr. Breidt and other STARMAP researchers from August to November 2004. Drs. Breidt and Hsu continued work on semiparametric modeling for increment-averaged data, which occur in soil and sediment core sampling. Dr. Breidt began to extend these methods to allow for semiparametric small area estimation of profiles, which are infinite-dimensional parameters. Ph.D. student Bill Coar collaborated with Drs. Hsu and Breidt on autocorrelation diagnostics for increment data using Cholesky autocovariance models.
Mr. Coar and Dr. Breidt have been using similar techniques to develop a class of state-space models for stream networks, with corresponding Kalman-like filtering and fixed-network smoothing algorithms and likelihood-based estimation techniques. Drs. Hsu and Breidt are working on Bayesian estimation for non-Gaussian noninvertible moving average models, with potential application to spatial data from river networks. Dr. Hsin-Cheng Huang worked on improving the power for trend detection of lake water quality using spatio-temporal models. Drs. Hsu, Breidt, and Huang are developing a new variable selection method for multiple layers of GIS data using LASSO, and have collaborated with Dr. Theobald (STARMAP Project 3) in its implementation. Such a methodology would be applicable immediately to aquatic resource data, for which multiple layers of geospatial data are available as potential regressors.
Future Activities:
Remaining funding will support 1 additional year of research and outreach activities, and will continue to support graduate students who currently are in the program. In the coming year, we expect to address a number of issues identified during the SARMMM plenary session and to complete and submit manuscripts that currently are being prepared. We anticipate that many of these manuscripts will be targeted to the proposed special issue of Environmetrics.
Journal Articles: 16 Displayed

Citation | Associated Reports
---|---
Andrews B, Davis RA, Breidt FJ. Maximum likelihood estimation for all-pass time series models. Journal of Multivariate Analysis 2006;97(7):1638-1659. | R829096 (2003), R829095 (Final), R829095C002 (2003), R829095C002 (2004)
Breidt FJ, Hsu N-J. Best mean square prediction for moving averages. Statistica Sinica 2005;15(2):427-446. | R829096 (2003), R829096 (2005), R829095 (Final), R829095C002 (2003), R829095C002 (2004), R829095C002 (2005)
Breidt FJ, Hsu N-J, Coar W. A diagnostic test for autocorrelation in increment-averaged data with application to soil sampling. Environmental and Ecological Statistics 2008;15(1):15-25. | R829096 (2005)
Buchanan RA, Conquest LL, Courbois J-Y. A cost analysis of ranked set sampling to estimate a population mean. Environmetrics 2005;16(3):235-256. | R829096 (2002), R829096 (2003), R829096 (2004), R829096 (2005)
Cooper C. Sampling and variance estimation on continuous domains. Environmetrics 2006;17(6):539-553. | R829096 (2005)
Courbois JP, Urquhart NS. Comparison of survey estimates of the finite population variance. Journal of Agricultural, Biological, and Environmental Statistics 2004;9(2):236-251. | R829096 (2003), R829096 (2004), R829096 (2005), R829095 (2004), R829095 (2005), R829095 (Final), R829095C003 (2003), R829095C003 (2004)
Da Silva DN, Opsomer JD. Properties of the weighting cell estimator under a nonparametric response mechanism. Survey Methodology 2004;30(1):45-55. | R829096 (2004), R829096 (2005), R829095 (2003), R829095 (2004), R829095 (2005), R829095 (Final), R829095C002 (2004), R829095C002 (2005)
Montanari GE, Ranalli MG. Nonparametric model calibration estimation in survey sampling. Journal of the American Statistical Association 2005;100(472):1429-1442. | R829096 (2004), R829096 (2005), R829095 (Final), R829095C002 (2004), R829095C002 (2005)
Munoz B, Lesser VM. Adjustment procedures to account for non-ignorable missing data in environmental surveys. Environmetrics 2006;17(6):653-662. | R829096 (2005)
Munoz B, Lesser VM, Ramsey FL. Design-based empirical orthogonal function model for environmental monitoring data analysis. Environmetrics 2008;19(8):805-817. | R829096 (2003), R829096 (2004), R829096 (2005)
Opsomer JD, Botts C, Kim JY. Small area estimation in a watershed erosion assessment survey. Journal of Agricultural, Biological, and Environmental Statistics 2003;8(2):139-152. | R829096 (2004), R829096 (2005), R829095 (2004), R829095 (2005), R829095 (Final), R829095C002 (2004)
Opsomer JD, Breidt FJ, Moisen GG, Kauermann G. Model-assisted estimation of forest resources with generalized additive models. Journal of the American Statistical Association 2007;102(478):400-409. | R829096 (2003), R829096 (2004), R829096 (2005), R829095 (2004), R829095 (2005), R829095 (Final), R829095C002 (2004), R829095C002 (2005)
Opsomer JD, Claeskens G, Ranalli MG, Kauermann G, Breidt FJ. Non-parametric small area estimation using penalized spline regression. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 2008;70(1):265-286. | R829096 (2005), R829095C002 (2005)
Stevens Jr. DL, Olsen AR. Variance estimation for spatially balanced samples of environmental resources. Environmetrics 2003;14(6):593-610. | R829096 (2003), R829096 (2005)
Stevens Jr. DL, Olsen AR. Spatially-balanced sampling of natural resources. Journal of the American Statistical Association 2004;99(465):262-278. | R829096 (2002), R829096 (2004)
Thomas DL, Johnson D, Griffith B. A Bayesian random effects discrete-choice model for resource selection: population-level selection inference. Journal of Wildlife Management 2006;70(2):404-412. | R829096 (2005), R829095 (Final)
Supplemental Keywords:
DAMARS, STARMAP, statistical modeling, environmental statistics, surface water monitoring, nonparametric regression estimators, RFA, Scientific Discipline, Ecosystem Protection/Environmental Exposure & Risk, Aquatic Ecosystems & Estuarine Research, Aquatic Ecosystem, Environmental Monitoring, EMAP, estuarine research, risk assessment, ecosystem monitoring, statistical survey design, spatial and temporal modeling, aquatic ecosystems, Environmental Monitoring and Assessment Program
Relevant Websites:
http://oregonstate.edu/dept/statistics/epa_program/
Progress and Final Reports:
Original Abstract
The perspectives, information and conclusions conveyed in research project abstracts, progress reports, final reports, journal abstracts and journal publications convey the viewpoints of the principal investigator and may not represent the views and policies of ORD and EPA. Conclusions drawn by the principal investigators have not been reviewed by the Agency.