Science Inventory

Modeling lake trophic state: a random forest approach

Citation:

Hollister, Jeff, Bryan Milstead, AND B. Kreakie. Modeling lake trophic state: a random forest approach. Ecosphere. ESA Journals, 7(3):e01321, (2016).

Impact/Purpose:

Harmful algal blooms (HABs) have received a great deal of interest over the last year or so. One of the key research issues related to HABs is the ability to predict the likelihood of their occurrence. Many methods exist for doing so, but they have not been tried across broad scales or with widely available landscape data. This research addresses both of these gaps. We show that data mining approaches, combined with ubiquitous landscape data, can accurately predict lake trophic state and that lake trophic state is strongly associated with cyanobacteria abundance.

Description:

Productivity of lentic ecosystems has been well studied, and it is widely accepted that as nutrient inputs increase, productivity increases and lakes transition from low trophic states (e.g. oligotrophic) to higher trophic states (e.g. eutrophic). These broad trophic state classifications are good predictors of ecosystem health and ecosystem services and disservices (e.g. recreation, aesthetics, fisheries, and harmful algal blooms). While the relationship between nutrients and trophic state provides reliable predictions, it requires *in situ* water quality data to parameterize the model. This limits the application of these models to lakes with existing and, more importantly, available water quality data. To expand our ability to predict trophic state in lakes without water quality data, we take advantage of the availability of a large national lakes water quality database, land use/land cover data, lake morphometry data, other universally available data, and modern data mining approaches to build and assess models of lake trophic state that may be more universally applied. We use random forests and random forest variable selection to identify variables to be used for predicting trophic state, and we compare the performance of two sets of models of trophic state (as determined by chlorophyll *a* concentration). The first set of models estimates trophic state with *in situ* as well as universally available data, and the second set of models uses universally available data only. For each of these models, we used three separate trophic state categories, for a total of six models. Overall accuracy for models built from *in situ* and universal data ranged from `r round(min(acc_results[1:3,2]),3)`% to `r round(max(acc_results[1:3,2]),3)`%. For the universal data only models, overall accuracy ranged from `r round(min(acc_results[4:6,2]),3)`% to `r round(max(acc_results[4:6,2]),3)`%.
Lastly, it is believed that the presence and abundance of cyanobacteria is strongly associated with trophic state. To test this, we examine the association between estimates of cyanobacteria abundance and measured chlorophyll *a* and find a positive relationship. Expanding these preliminary results to include cyanobacteria taxa indicates that cyanobacteria are significantly more likely to be found in highly eutrophic lakes. These results suggest that predictive models of lake trophic state may be improved with additional information on the landscape surrounding lakes and that those models provide additional information on the presence of potentially harmful cyanobacteria taxa.
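
As a rough illustration of two quantities the description relies on, trophic state classes derived from chlorophyll *a* and overall accuracy, the Python sketch below assumes four classes with cut points at 2, 7, and 30 µg/L. Those thresholds, the function names, and the sample values are illustrative assumptions rather than details stated in this record, and the sketch does not reproduce the random forest models themselves.

```python
from bisect import bisect_left

# ASSUMPTION: illustrative chlorophyll a cut points in µg/L (upper bounds,
# inclusive). They are not taken from this record.
CUT_POINTS = [2.0, 7.0, 30.0]
CLASSES = ["oligotrophic", "mesotrophic", "eutrophic", "hypereutrophic"]


def trophic_state(chla_ug_per_l):
    """Map a chlorophyll a concentration (µg/L) to a trophic state class."""
    if chla_ug_per_l < 0:
        raise ValueError("chlorophyll a concentration cannot be negative")
    # bisect_left returns the index of the first cut point >= the value,
    # so a value equal to a cut point falls in the lower class.
    return CLASSES[bisect_left(CUT_POINTS, chla_ug_per_l)]


def overall_accuracy(observed, predicted):
    """Percentage of lakes whose predicted class matches the observed class."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(o == p for o, p in zip(observed, predicted))
    return 100.0 * hits / len(observed)


# Hypothetical measurements and model predictions, for demonstration only.
measured_chla = [1.2, 5.0, 12.0, 45.0]
observed = [trophic_state(c) for c in measured_chla]
predicted = ["oligotrophic", "mesotrophic", "mesotrophic", "hypereutrophic"]

print(observed)
print(overall_accuracy(observed, predicted))  # 3 of 4 classes match: 75.0
```

In practice the predicted classes would come from a fitted classifier rather than a hand-written list; only the scoring step is shown here.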

Record Details:

Record Type: DOCUMENT (JOURNAL/PEER REVIEWED JOURNAL)
Product Published Date: 03/21/2016
Record Last Revised: 03/21/2016
OMB Category: Other
Record ID: 311430