Analyzing Data


DA.6. Using Statistics Responsibly

Once data have been gathered, scrutinized for quality assurance, and potentially grouped or normalized according to appropriate parameters, the analysis of the trends and relationships may begin. This page provides advice on the proper use of statistical analysis within the framework of Stressor Identification (SI).

DA.6.1. Know the Data

It is important to know your data in order to avoid making errors in applying analytical methods or interpreting output.

Summary statistics and graphics facilitate comparisons, reveal the distribution of the data, help you decide whether to transform the data, and provide insight into which analyses to use. Graphical methods (e.g., scatter plots and box plots) often help to determine whether the data support or weaken the case for a candidate cause.
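As a minimal sketch of the summary statistics described above, the following Python example computes the values that underlie a box plot for two groups of observations. The function name `summarize` and the conductivity readings are hypothetical and are used only for illustration; they are not CADDIS data or a CADDIS tool.

```python
import statistics


def summarize(values):
    """Return basic summary statistics for a list of numeric observations."""
    # Quartiles computed with the "inclusive" method, which treats the
    # data as including the minimum and maximum of the population.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "n": len(values),
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
    }


# Hypothetical conductivity readings (uS/cm); not real monitoring data.
reference = [110, 125, 130, 142, 150, 155, 160]
site = [180, 210, 260, 300, 340, 420, 900]

for name, values in (("reference", reference), ("site", site)):
    stats = summarize(values)
    # A mean well above the median hints at right skew, which may
    # motivate a transformation (e.g., log) before further analysis.
    print(name, stats)
```

In this made-up example the site mean sits well above the site median, which is the kind of pattern that would prompt a closer look at the distribution before choosing an analysis.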

Quantify relationships between effects variables and measures of candidate causes, and among variables representing steps in a causal sequence. Correlation analyses measure the degree of association between variables. Regression analysis is the foundational method for quantifying relationships among variables. Other methods (e.g., species sensitivity distributions (SSDs), predicting environmental conditions from biological observations, and data normalization) rely on regression.
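The two basic quantities mentioned above, a correlation coefficient and a fitted regression line, can be sketched in Python as follows. This is an illustrative example only: the helper names `pearson_r` and `fit_line` and the paired sediment and taxa-richness values are assumptions made for the sketch, and a real analysis would usually rely on an established statistics package and check the assumptions of the method.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical paired observations; not real monitoring data.
percent_fines = [5, 10, 15, 20, 25, 30]   # candidate cause (stressor)
taxa_richness = [22, 20, 17, 15, 11, 9]   # biological response

r = pearson_r(percent_fines, taxa_richness)
slope, intercept = fit_line(percent_fines, taxa_richness)
print(f"r = {r:.3f}, slope = {slope:.3f}, intercept = {intercept:.2f}")
```

The correlation coefficient describes only the degree of association, while the slope expresses the size of the change in the response per unit change in the stressor.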

DA.6.2. Interpreting Differences

Caution is required when interpreting differences between site and reference observations or changes over a stressor gradient. Differences should be interpreted in terms of magnitude and consistency rather than statistical significance. Use caution when testing hypotheses that site observations differ from reference observations or that a biological response changes over a stressor gradient because:

DA.6.3. Jumping to Conclusions

Concluding that a candidate stressor is or is not the cause based on hypothesis testing results or the strength of a statistical relationship (e.g., a correlation coefficient) is inappropriate because:

  • Stressors often covary with each other and with natural environmental attributes. A strong relationship between the biological response and a candidate cause could reflect a covarying stressor or natural factor rather than the candidate cause itself.
  • Hypothesis testing was designed for interpreting controlled experiments with replicates and random assignment of treatments.
  • Field data from observational studies rarely include replicates, and "treatments" are not randomly assigned.
  • Therefore, even strong associations do not prove causation.

Rather than relying on a statistical result, use all available types of evidence in the CADDIS inferential logic.
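The covariation problem described in this section can be illustrated with a small Python sketch. All values are invented for the illustration: the response is constructed to depend only on a covarying factor, yet the candidate cause, which merely tracks that factor, shows an almost equally strong correlation with the response. The variable names and the helper `pearson_r` are assumptions for this example.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data, constructed for illustration only.
# The covarying factor is what actually drives the response here.
covarying_factor = [1, 2, 3, 4, 5, 6, 7, 8]
# The candidate cause tracks the covarying factor closely but, by
# construction, has no effect of its own on the response.
candidate_cause = [1.2, 1.9, 3.1, 4.2, 4.8, 6.1, 7.0, 7.9]
# The response is computed from the covarying factor alone.
response = [30 - 3 * z for z in covarying_factor]

r_candidate = pearson_r(candidate_cause, response)
r_covariate = pearson_r(candidate_cause, covarying_factor)
print(f"candidate cause vs. response:         r = {r_candidate:.3f}")
print(f"candidate cause vs. covarying factor: r = {r_covariate:.3f}")
```

Both correlations are strong, so the statistic alone cannot distinguish the candidate cause from the factor it covaries with; other types of evidence are needed to make that distinction.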

