Science Inventory

Using Random Forests to Map Floodplains for the Conterminous USA

Citation:

Woznicki, S., J. Baynes, S. Panlasigui, M. Mehaffey, AND A. Neale. Using Random Forests to Map Floodplains for the Conterminous USA. US-IALE 2018 Annual Meeting, Chicago, IL, April 08 - 12, 2018.

Impact/Purpose:

New dataset for EnviroAtlas; new method to map floodplain areas in CONUS, filling in large areas that are currently unmapped.

Description:

Floodplains perform several important ecosystem services, including storing water during precipitation events and reducing peak flows, thereby reducing flooding of adjacent communities. Understanding the relationship between flood inundation and floodplains is critical for the health and well-being of ecosystems and communities, as well as for targeting floodplain and riparian restoration. Many communities in the United States, particularly those in rural areas, lack flood inundation maps due to the high cost of flood modeling. Only 60% of the conterminous United States (CONUS) has been mapped through the Federal Emergency Management Agency (FEMA) Flood Insurance Rate Maps (FIRM) program. Therefore, we developed a complete 30-meter resolution flood inundation map of the CONUS using random forests, with existing FIRM 100-year floodplains as training data. Random forests are an ensemble machine learning method for classification tasks and have previously been applied to problems such as land cover classification and disaster identification. Input datasets included digital elevation model (DEM)-derived variables, flood-related soil characteristics, and land cover. Models were trained and tested at the hydrologic unit code level two (HUC-2) scale, and each 30-m pixel in the CONUS was classified as floodplain or not-floodplain. The most important variables were typically vertical distance to channel and overland flow distance (both DEM derivatives) and the soils’ dominant flood frequency class (e.g., rare, occasional, frequent), although their relative importance varied by HUC. Classification accuracy was assessed using the F1 statistic, which balances the precision and recall of the model when compared to the FIRMs. The models performed well in the eastern and midwestern CONUS but were less robust in the arid Southwest, likely due to greater topographic complexity, coarser soils data, and a lack of quality model training data. However, the overall performance of the random forest models in this context demonstrates the method’s utility for completing the remaining 40% of the CONUS without mapped floodplains.
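As an illustration of the F1 statistic mentioned above, the following minimal Python sketch computes precision, recall, and F1 for the floodplain class from two per-pixel label sequences. The function name `f1_score` and the eight example labels are hypothetical and made up for illustration; they are not data or code from the study, and the sketch only assumes that 1 marks a floodplain pixel and 0 a not-floodplain pixel.

```python
def f1_score(predicted, reference):
    """Return F1 for the positive class (1 = floodplain) of two 0/1 sequences.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")

    # Count true positives, false positives, and false negatives.
    tp = sum(1 for p, r in zip(predicted, reference) if p == 1 and r == 1)
    fp = sum(1 for p, r in zip(predicted, reference) if p == 1 and r == 0)
    fn = sum(1 for p, r in zip(predicted, reference) if p == 0 and r == 1)

    if tp == 0:
        # No correctly predicted floodplain pixels: define F1 as 0.
        return 0.0

    precision = tp / (tp + fp)  # share of predicted floodplain that is correct
    recall = tp / (tp + fn)     # share of reference floodplain that was found
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Made-up labels for eight pixels (assumption: 1 = floodplain, 0 = not).
firm_labels = [1, 1, 1, 0, 0, 0, 0, 1]   # reference, standing in for FIRM data
model_labels = [1, 1, 0, 0, 1, 0, 0, 1]  # standing in for model predictions

score = f1_score(model_labels, firm_labels)
print(score)
```

Because F1 ignores true negatives, it is not inflated by the large number of not-floodplain pixels that a model can classify correctly with little effort, which is one reason it suits a task where the floodplain class covers a minority of the landscape.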

Record Details:

Record Type: DOCUMENT (PRESENTATION/SLIDE)
Product Published Date: 04/12/2018
Record Last Revised: 04/16/2018
OMB Category: Other
Record ID: 340403