Science Inventory

A Random Forest Approach to Predict the Spatial Distribution of Sediment Pollution in an Estuarine System


Walsh, E., B. Kreakie, M. Cantwell, AND D. Nacci. A Random Forest Approach to Predict the Spatial Distribution of Sediment Pollution in an Estuarine System. PLoS ONE. Public Library of Science, San Francisco, CA, 12(7):e0179473, (2017).


The work presented in this manuscript provides insight into potential hotspots of triclosan in Narragansett Bay. The methods take opportunistically collected data from different research projects and analyze them in a robust manner to make bay-wide predictions of triclosan contamination. Because this type of data is common, the approach may have broad applications.


Modeling the magnitude and distribution of sediment-bound pollutants in estuaries is often limited by incomplete knowledge of the site and inadequate sample density. To address these modeling limitations, a decision-support tool framework was conceived that predicts sediment contamination from the sub-estuary to broader estuary extent. For this study, a Random Forest (RF) model was implemented to predict the distribution of a model contaminant, triclosan (5-chloro-2-(2,4-dichlorophenoxy)phenol) (TCS), in Narragansett Bay, Rhode Island, USA. TCS is an unregulated contaminant used in many personal care products. The RF explanatory variables were associated with TCS transport and fate (proxies) and with direct and indirect environmental entry. The continuous RF TCS concentration predictions were discretized into three levels of contamination (low, medium, and high) for three different quantile thresholds. The RF model explained 63% of the variance with a minimum number of variables. Total organic carbon (TOC), a transport and fate proxy, was a strong predictor of TCS contamination: randomly permuting its values increased the mean squared error by 59%. Additionally, combined sewer overflow discharge (environmental entry) and sand (transport and fate proxy) were strong predictors. The discretization models identified a TCS area of greatest concern in the northern reach of Narragansett Bay (Providence River sub-estuary), which was validated with independent test samples. This decision-support tool performed well at the sub-estuary extent and provided the means to identify areas of concern and prioritize bay-wide sampling.
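Two steps named in the abstract, discretizing continuous predictions at quantile thresholds and scoring a predictor by the error increase after permuting it, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the 0.25/0.75 cut probabilities, and the toy stand-in model are all assumptions made for illustration, and the percentage computed here is one simple reading of "mean squared error increase" that may differ from the importance measure reported in the paper.

```python
import random


def quantile(values, p):
    """Linear-interpolation quantile of `values` for a probability p in [0, 1]."""
    ordered = sorted(values)
    pos = p * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    frac = pos - lo
    return ordered[lo] * (1 - frac) + ordered[hi] * frac


def discretize(predictions, p_low, p_high):
    """Label each continuous prediction low/medium/high using two quantile cuts.

    p_low and p_high are hypothetical thresholds; the record does not state
    which quantiles the study used.
    """
    low_cut = quantile(predictions, p_low)
    high_cut = quantile(predictions, p_high)
    labels = []
    for value in predictions:
        if value <= low_cut:
            labels.append("low")
        elif value <= high_cut:
            labels.append("medium")
        else:
            labels.append("high")
    return labels


def mse(predict, rows, targets):
    """Mean squared error of `predict` over (row, target) pairs."""
    return sum((predict(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def pct_mse_increase(predict, rows, targets, column, repeats=20, seed=0):
    """Average percentage increase in MSE after shuffling one predictor column.

    `predict` is any fitted model wrapped as a function of one row (a tuple);
    a real analysis would pass a trained Random Forest here.
    """
    rng = random.Random(seed)
    baseline = mse(predict, rows, targets)
    increases = []
    for _ in range(repeats):
        shuffled = [r[column] for r in rows]
        rng.shuffle(shuffled)
        permuted = [r[:column] + (v,) + r[column + 1:] for r, v in zip(rows, shuffled)]
        increases.append(100.0 * (mse(predict, permuted, targets) - baseline) / baseline)
    return sum(increases) / repeats


# Toy data (invented): column 0 plays the role of TOC, column 1 is irrelevant.
rows = [(1.0, 5.0), (2.0, 3.0), (3.0, 8.0), (4.0, 1.0), (5.0, 7.0), (6.0, 2.0)]
noise = [0.1, -0.1, 0.1, -0.1, 0.1, -0.1]
targets = [2.0 * r[0] + n for r, n in zip(rows, noise)]
toy_model = lambda row: 2.0 * row[0]  # stand-in for a fitted model

print(discretize([toy_model(r) for r in rows], 0.25, 0.75))
print(pct_mse_increase(toy_model, rows, targets, column=0))
print(pct_mse_increase(toy_model, rows, targets, column=1))
```

In this toy setup the stand-in model ignores the second column, so permuting it leaves the error unchanged, while permuting the first column degrades the predictions. That contrast is the intuition behind ranking TOC, combined sewer overflow discharge, and sand as strong predictors.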

Record Details:

Product Published Date: 07/24/2017
Record Last Revised: 08/21/2017
OMB Category: Other
Record ID: 337294