Science Inventory

Watershed clustering for Eco-hydrologic assessment using machine learning

Citation:

Muche, M., S. Sinnathamby, AND J. Johnston. Watershed clustering for Eco-hydrologic assessment using machine learning. 2019 International Society for Ecological Modelling Global Conference, Salzburg, AUSTRIA, October 01 - 05, 2019.

Impact/Purpose:

Presented at the International Society for Ecological Modelling Global Conference 2019

Description:

Land use/cover, soil properties, climatic conditions, and topographic characteristics affect the hydrological processes of a watershed. Quantifying water quantity and quality characteristics across multiple watersheds (and over larger geographic areas) requires a systematic approach to catchment classification. Paired watershed design has been used to study the impact of land use change using neighboring watersheds. The approach is especially important for ungauged watershed assessment and management. The premise is that watersheds located in close proximity respond similarly to climate and land use/cover change. Though paired watershed assessment has been used widely, the method lacks objective criteria. This study applied machine learning methods, specifically K-means and hierarchical agglomerative clustering, to identify hydrologically similar, paired watersheds (Hydrologic Unit Code-12 level) in North and South Carolina. The USEPA StreamCat national dataset was used to explore catchment-related landscape features that would enhance the national applicability of our methodology. Mean values of HUC-12 watershed elevation, % imperviousness, water table depth, permeability, runoff, baseflow index, precipitation (30-year average), temperature (30-year average minimum and mean), % developed area, % crop land use, % forest, topographic index, % clay, and % sand were used for the analysis. These variables play important roles in hydrologic processes by influencing surface runoff behavior and by affecting overland and subsurface flow velocities, which determine the rate of runoff. Landscape variables also reveal the closeness of channel network spacing and the terrain heterogeneity, which affects convergence/divergence of flow, soil moisture, and infiltration. All data were standardized by calculating z-scores (subtracting each variable's mean and dividing by its standard deviation) with the StandardScaler tool, so that variables measured in different units contribute on a comparable scale, and to improve generalizability of the results. K-means and agglomerative hierarchical clustering techniques were then applied to group similar watersheds. The methodology demonstrates the utility of machine learning for systematic analysis of water quantity and quality, especially in the case of ungauged watersheds.
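
The workflow described above (z-score standardization followed by K-means and agglomerative hierarchical clustering) can be illustrated with a minimal sketch. The abstract names a StandardScaler tool, which suggests scikit-learn, but it does not state the library, the number of clusters, or the linkage method, so those choices are assumptions here. The data are synthetic placeholders with three illustrative attributes, not StreamCat values and not the authors' full variable set.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a HUC-12 attribute table (NOT StreamCat data).
# Rows are watersheds; columns are three illustrative attributes:
# mean elevation (m), % imperviousness, % forest.
rng = np.random.default_rng(0)
upland = rng.normal(loc=[900.0, 2.0, 80.0], scale=[50.0, 0.5, 5.0], size=(20, 3))
coastal = rng.normal(loc=[20.0, 25.0, 30.0], scale=[5.0, 3.0, 5.0], size=(20, 3))
X = np.vstack([upland, coastal])

# z-score each column: (value - column mean) / column standard deviation,
# so that attributes in different units are on a comparable scale.
X_std = StandardScaler().fit_transform(X)

# Assumption: two clusters, chosen only because the toy data has two groups.
kmeans_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X_std)

# Assumption: Ward linkage; the abstract does not say which linkage was used.
agglo_labels = AgglomerativeClustering(n_clusters=2, linkage="ward").fit_predict(X_std)

# Watersheds sharing a label are candidate "similar" watersheds.
print("K-means cluster sizes:      ", np.bincount(kmeans_labels))
print("Agglomerative cluster sizes:", np.bincount(agglo_labels))
```

With real data, the number of clusters would need to be chosen with care (for example by comparing silhouette scores or inspecting a dendrogram), and the two methods need not agree as cleanly as they do on this well-separated toy example.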

Record Details:

Record Type: DOCUMENT (PRESENTATION/SLIDE)
Product Published Date: 10/05/2019
Record Last Revised: 10/02/2019
OMB Category: Other
Record ID: 346900