Aggregate Measures of Watershed Health from Reconstructed Water Quality Data with Uncertainty
Hoque, Y., S. Tripathi, M. Hantush, AND R. Govindaraju. Aggregate Measures of Watershed Health from Reconstructed Water Quality Data with Uncertainty. Ed Gregorich (ed.), JOURNAL OF ENVIRONMENTAL QUALITY. American Society of Agronomy, MADISON, WI, 45(2):709-719, (2016).
The overall objective of this study was to devise a methodology for assessing watershed health by calculating aggregate reliability, resilience, and vulnerability (R-R-V) values that retain information for multiple water quality (WQ) constituents at select locations within a watershed. Uncertainty was to be incorporated into the calculation of these aggregate R-R-V values. The specific steps needed to meet this objective were: (1) Obtain reconstructed daily time-series data and the associated reconstruction error for multiple WQ constituents at several locations along the stream networks of the study watersheds using a relevance vector machine (RVM). (2) Apply variational Bayesian noisy principal component analysis (VBaNPCA) to reduce the multidimensional WQ data set to a single-dimension data set; also obtain the uncertainty associated with individual values in the reduced series, and examine the role of error propagation through the data reconstruction and dimensionality reduction steps. (3) Calculate R-R-V values based on the new aggregate data set, and examine how the aggregate values compare with their individual constituent counterparts.
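Step (3) can be illustrated with a minimal sketch. The record does not state the exact R-R-V formulas used in the study, so the definitions below are assumptions based on one common formulation: reliability as the fraction of time steps in compliance, resilience as the probability that a failure is followed by compliance at the next step, and vulnerability as the mean exceedance above the threshold during failures. The function names, the "value above threshold means failure" rule, and the sample numbers are hypothetical.

```python
from typing import List, Sequence


def failure_flags(series: Sequence[float], threshold: float) -> List[bool]:
    """Flag each time step whose value exceeds the threshold (assumed failure rule)."""
    if len(series) == 0:
        raise ValueError("series must contain at least one value")
    return [x > threshold for x in series]


def reliability(series: Sequence[float], threshold: float) -> float:
    """Fraction of time steps that comply with the threshold."""
    flags = failure_flags(series, threshold)
    return 1.0 - sum(flags) / len(flags)


def resilience(series: Sequence[float], threshold: float) -> float:
    """Probability that a failure is followed by compliance at the next step.

    Returns 1.0 when no failure has a following step (an assumed convention).
    """
    flags = failure_flags(series, threshold)
    # Only failures that have a successor can be judged for recovery.
    failures = sum(1 for f in flags[:-1] if f)
    recoveries = sum(1 for f, nxt in zip(flags[:-1], flags[1:]) if f and not nxt)
    return recoveries / failures if failures else 1.0


def vulnerability(series: Sequence[float], threshold: float) -> float:
    """Mean exceedance above the threshold over failure steps (0.0 if none)."""
    failure_flags(series, threshold)  # reuse the empty-series check
    exceedances = [x - threshold for x in series if x > threshold]
    return sum(exceedances) / len(exceedances) if exceedances else 0.0


if __name__ == "__main__":
    # Hypothetical aggregate series and threshold, for illustration only.
    aggregate = [0.2, 0.8, 1.4, 1.6, 0.9, 1.2, 0.5, 0.3]
    limit = 1.0
    print(reliability(aggregate, limit))
    print(resilience(aggregate, limit))
    print(vulnerability(aggregate, limit))
```

Other formulations exist, for example vulnerability based on the maximum rather than the mean exceedance, so the choice above should be read as one option rather than the study's definition.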
Risk-based indices such as reliability, resilience, and vulnerability (R-R-V) have the potential to serve as watershed health assessment tools. Recent research has demonstrated the applicability of such indices for water quality (WQ) constituents such as total suspended solids and nutrients on an individual basis. However, the calculations can become tedious when time-series data for several WQ constituents have to be evaluated individually. Also, comparisons between locations with different sets of constituent data can prove difficult. In this study, data reconstruction using a relevance vector machine algorithm was combined with dimensionality reduction via variational Bayesian noisy principal component analysis to reconstruct and condense sparse multidimensional WQ data sets into a single time series. The methodology allows incorporation of uncertainty in both the reconstruction and dimensionality-reduction steps. The R-R-V values were calculated using the aggregate time series at multiple locations within two Indiana watersheds, which served as motivating examples. Results showed that uncertainty present in the reconstructed WQ data set propagates to the aggregate time series and subsequently to the aggregate R-R-V values. Locations with different WQ constituents and different standards for impairment were successfully combined to provide aggregate measures of R-R-V values. Comparisons with individual constituent R-R-V values showed that variability present in the original multidimensional WQ data set also propagates to the aggregate time series and subsequently to the aggregate R-R-V values. The data-driven approach to calculating aggregate R-R-V values was found to be useful for providing a composite picture of watershed health. Aggregate R-R-V values also enabled comparison between locations with different types of WQ data.
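The propagation of uncertainty from a time series into an index computed from it can be sketched with a simple Monte Carlo experiment. This is an illustration only and is not the study's procedure: it assumes each value of the series carries an independent Gaussian error with a known standard deviation, draws many perturbed series, and reports the spread of the resulting reliability values. The function names, the Gaussian error model, the 95% interval, and the sample numbers are all hypothetical.

```python
import random
from typing import Sequence, Tuple


def reliability(series: Sequence[float], threshold: float) -> float:
    """Fraction of time steps at or below the threshold (assumed compliance rule)."""
    return sum(1 for x in series if x <= threshold) / len(series)


def reliability_interval(
    means: Sequence[float],
    stds: Sequence[float],
    threshold: float,
    n_draws: int = 2000,
    seed: int = 0,
) -> Tuple[float, float, float]:
    """Return (2.5th percentile, mean, 97.5th percentile) of reliability.

    Assumes independent Gaussian errors around each value, which is a
    simplifying assumption made for this sketch.
    """
    if len(means) != len(stds) or len(means) == 0:
        raise ValueError("means and stds must be non-empty and of equal length")
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    samples = []
    for _ in range(n_draws):
        # Perturb every value by its own error and recompute the index.
        draw = [rng.gauss(m, s) for m, s in zip(means, stds)]
        samples.append(reliability(draw, threshold))
    samples.sort()
    lower = samples[int(0.025 * (n_draws - 1))]
    upper = samples[int(0.975 * (n_draws - 1))]
    return lower, sum(samples) / n_draws, upper


if __name__ == "__main__":
    # Hypothetical aggregate values and per-value standard deviations.
    values = [0.2, 0.8, 1.4, 1.6, 0.9, 1.2, 0.5, 0.3]
    errors = [0.1, 0.2, 0.2, 0.3, 0.2, 0.2, 0.1, 0.1]
    print(reliability_interval(values, errors, threshold=1.0))
```

With all standard deviations set to zero the interval collapses to the deterministic reliability, which gives a quick sanity check of the sketch.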