QQ-plots for assessing distributions of biomarker measurements and generating defensible summary statistics
Pleil, J. QQ-plots for assessing distributions of biomarker measurements and generating defensible summary statistics. Journal of Breath Research. Institute of Physics Publishing, Bristol, UK, 10(3):035001, (2016).
Most environmental and biological measurement datasets are lognormally distributed (Limpert et al. 2001, Pleil et al. 2014). This feature has a real-world interpretation: in complex environmental and biological systems, certain combinations of circumstances can produce rare and very high values, but a measurement can never return a true negative number; the shape of the lognormal distribution accounts for this reality. Although measurement data are sometimes normally distributed, the methods for interpreting data described here can be used in either situation.
One of the main uses of biomarker measurements is to compare different populations to each other and to assess risk in comparison to established parameters. This is most often done using summary statistics such as central tendency, variance components, confidence intervals, exceedance levels and percentiles. Such comparisons are only valid if the underlying assumptions about the distribution are correct. This article discusses methodology for interpreting and evaluating data distributions using quantile-quantile plots (QQ-plots), and for deciding how to treat outliers, interpreting the effects of mixed distributions, and identifying left-censored data. The QQ-plot is shown to be a simple and elegant tool for visually inspecting complex data and for deciding whether summary statistics should be calculated after log-transformation.
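As a rough illustration of the idea, the following Python sketch builds the points of a normal QQ-plot for a dataset on both the raw and the log scale, and compares how close each set of points is to a straight line. It is not the article's procedure: the synthetic data, the plotting-position formula (i - 0.5)/n, the use of a squared correlation coefficient as a stand-in for visual inspection, and the function names `qq_points`, `qq_r_squared`, and `geometric_summary` are all assumptions made for the example.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def qq_points(values):
    """Return (theoretical z-scores, ordered observations) for a normal QQ-plot."""
    ordered = sorted(values)
    n = len(ordered)
    std_normal = NormalDist()
    # Plotting positions (i - 0.5) / n are one common convention (assumed here).
    z = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return z, ordered


def qq_r_squared(values):
    """Squared Pearson correlation of the QQ points; values near 1 mean near-linear."""
    z, ordered = qq_points(values)
    mz, mo = mean(z), mean(ordered)
    sxy = sum((a - mz) * (b - mo) for a, b in zip(z, ordered))
    sxx = sum((a - mz) ** 2 for a in z)
    syy = sum((b - mo) ** 2 for b in ordered)
    return sxy * sxy / (sxx * syy)


def geometric_summary(values):
    """Geometric mean and geometric standard deviation, computed in log space."""
    logs = [math.log(v) for v in values]
    return math.exp(mean(logs)), math.exp(stdev(logs))


# Synthetic, hypothetical "biomarker" values drawn from a lognormal distribution.
rng = random.Random(0)
data = [rng.lognormvariate(1.0, 0.8) for _ in range(200)]

r2_raw = qq_r_squared(data)                          # QQ linearity on the raw scale
r2_log = qq_r_squared([math.log(v) for v in data])   # QQ linearity after log-transform
geo_mean, geo_sd = geometric_summary(data)

print(f"R^2 raw scale: {r2_raw:.3f}")
print(f"R^2 log scale: {r2_log:.3f}")
if r2_log > r2_raw:
    print(f"Log scale is straighter; geometric mean {geo_mean:.2f}, geometric SD {geo_sd:.2f}")
```

If the log-scale points fall closer to a straight line, summary statistics such as the geometric mean and geometric standard deviation are the more defensible choice. In practice the points from `qq_points` would be plotted and inspected by eye, since curvature, outliers, and flat runs at the low end (which can suggest left-censoring) are easier to see than to capture in a single number.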