Science Inventory

Evaluating Urban Background Metal Concentration Clusters with Bayesian Networks: Southeastern Urban Centers in EPA Region 4

Citation:

Carriger, John F., Robert G. Ford, T. Frederick, S. Chan, AND Y. Fung. Evaluating Urban Background Metal Concentration Clusters with Bayesian Networks: Southeastern Urban Centers in EPA Region 4. SETAC North America 41st Annual Meeting, NA, Virtual, November 15 - 19, 2020.

Impact/Purpose:

To present a characterization of an urban background metals database using Bayesian networks at the North American SETAC meeting.

Description:

Soils in urban settings are likely to contain elevated levels of certain metals due to human activity, non-point source industrial operations, and infrastructure materials. Because these elevated concentrations result from anthropogenic urban activity rather than site-related point source releases, they can be considered to represent urban background soil concentrations. Whether soil contaminant concentrations are site-related or part of the natural or anthropogenic background is a question that frequently arises during environmental site investigations in urban settings. However, robust large-scale data on urban background concentrations have been lacking, which can complicate decision making. The U.S. Environmental Protection Agency recently conducted comprehensive sampling of background surface soil concentrations to support risk assessment and risk management at urban locations in the Southeastern U.S. Even with a comprehensive data set, setting threshold concentrations for metals at individual sites can be difficult, especially in urban settings, given the varying background and historical contributions to concentrations in different soils. Bayesian networks are useful for machine learning and for discovering patterns in data. One machine learning technique that is especially useful for examining such data is clustering. To identify clusters in the urban background database, a latent factor variable was constructed by grouping the urban background metals sample data into multiple clusters with expectation maximization. A naïve Bayesian network was the basis for the clustering: each analyzed metal was a child node of the latent factor variable, whose states are the clusters. To interpret the clustering, the relative magnitudes of the metal concentrations in each cluster were examined through a conditional means analysis.
The joint probability of the latent factor variable and a variable identifying the city of each sampling site, used as a localized grouping, was extracted for comparison. Some cities had unique clusters, while others shared multiple background metal clusters. The clustering analysis can be useful for isolating, grouping, and/or comparing assumptions about background data when the clusters for the metals are homogeneous with respect to the data, stable, and interpretable with scientific knowledge of the differences in background concentrations.
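
The clustering workflow described above can be illustrated with a minimal sketch. It is not the authors' implementation: the record does not name the software used or say whether the metal variables were discretized. The sketch assumes each metal's log-concentration is Gaussian and conditionally independent given the latent cluster (the naïve Bayes structure), fits the model with expectation maximization, reads the per-cluster means as the conditional means, and builds a joint cluster-by-city table. The function names (fit_latent_class_em, cluster_city_joint), the city labels, and the data are hypothetical; the data are synthetic and are not EPA measurements.

```python
import math
import random


def fit_latent_class_em(rows, k, n_iter=50, seed=0):
    """EM for a naive Bayes model with one latent cluster node.

    rows: list of equal-length lists, one column per metal.
    Assumption: each metal is Gaussian and independent given the cluster.
    Returns (weights, means, variances, responsibilities).
    """
    rng = random.Random(seed)
    n, d = len(rows), len(rows[0])
    # Start from k randomly chosen samples and the pooled per-metal variance.
    means = [list(r) for r in rng.sample(rows, k)]
    col_mean = [sum(r[j] for r in rows) / n for j in range(d)]
    pooled = [sum((r[j] - col_mean[j]) ** 2 for r in rows) / n + 1e-6
              for j in range(d)]
    variances = [list(pooled) for _ in range(k)]
    weights = [1.0 / k] * k
    resp = []
    for _ in range(n_iter):
        # E-step: posterior P(cluster | sample), computed in log space.
        resp = []
        for r in rows:
            logp = []
            for c in range(k):
                lp = math.log(weights[c])
                for j in range(d):
                    v = variances[c][j]
                    lp -= 0.5 * (math.log(2 * math.pi * v)
                                 + (r[j] - means[c][j]) ** 2 / v)
                logp.append(lp)
            top = max(logp)
            p = [math.exp(x - top) for x in logp]
            total = sum(p)
            resp.append([x / total for x in p])
        # M-step: re-estimate cluster weights, conditional means, variances.
        for c in range(k):
            nc = sum(resp[i][c] for i in range(n)) + 1e-12
            weights[c] = nc / n
            for j in range(d):
                mu = sum(resp[i][c] * rows[i][j] for i in range(n)) / nc
                var = sum(resp[i][c] * (rows[i][j] - mu) ** 2
                          for i in range(n)) / nc
                means[c][j] = mu
                variances[c][j] = var + 1e-6  # floor avoids variance collapse
    return weights, means, variances, resp


def cluster_city_joint(resp, cities):
    """Joint table P(cluster, city) accumulated from soft assignments."""
    n = len(cities)
    joint = {}
    for r, city in zip(resp, cities):
        for c, p in enumerate(r):
            joint[(c, city)] = joint.get((c, city), 0.0) + p / n
    return joint


# Synthetic log-concentrations for two hypothetical metals in two
# hypothetical cities; these values are illustrative only.
data_rng = random.Random(1)
rows, cities = [], []
for city, centre in (("City A", (1.0, 3.0)), ("City B", (4.0, 0.5))):
    for _ in range(40):
        rows.append([data_rng.gauss(centre[0], 0.3),
                     data_rng.gauss(centre[1], 0.3)])
        cities.append(city)

weights, means, variances, resp = fit_latent_class_em(rows, k=2)
joint = cluster_city_joint(resp, cities)

for c, mu in enumerate(means):
    print("cluster", c, "conditional means:", [round(m, 2) for m in mu])
for key in sorted(joint):
    print(key, round(joint[key], 3))
```

In this sketch, a city with a unique cluster would show nearly all of its joint probability mass in one cluster row, while cities that share clusters would spread their mass across the same rows. The number of clusters k is fixed by hand here; choosing it, and checking that the resulting clusters are stable, would need a separate step.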

URLs/Downloads:

EVALUATING URBAN BACKGROUND METAL CONCENTRATION CLUSTERS.PDF (PDF, NA pp, 532.528 KB)

Record Details:

Record Type: DOCUMENT (PRESENTATION/POSTER)
Product Published Date: 11/19/2020
Record Last Revised: 12/11/2020
OMB Category: Other
Record ID: 350385