Science Inventory

Data Mining Framework for Collecting Chemical-Centric Data for End-of-Life Flow Inventory

Citation:

Hernandez-Betancur, J., Gerardo J. Ruiz-Mercado, AND M. Martin. Data Mining Framework for Collecting Chemical-Centric Data for End-of-Life Flow Inventory. The American Institute of Chemical Engineers (AIChE) Annual Meeting, Phoenix, AZ, November 13 - 18, 2022.

Impact/Purpose:

Estimating life cycle inventories (LCIs) of chemical releases is critical for environmental, health, and safety evaluations. This presentation shows a data mining framework that extracts and transforms data from publicly accessible, siloed, multi-country database systems to generate LCIs of chemical transfers to off-site facilities for end-of-life (EoL) management. By incorporating LCI data from different countries and capturing how environmental reporting criteria change across consecutive years, the framework can help account for regulatory and environmental law differences across geographical locations and over time.

Description:

Tracking chemical flows and collecting life cycle inventories (LCIs) are crucial steps for identifying potential exposure scenarios at the chemical end-of-life (EoL) stage. Nonetheless, these tasks are time-consuming and challenging. Data-driven modeling is considered a powerful tool to streamline the identification of exposure scenarios, potential environmental releases, and material transfers. However, the first step is to build a data pipeline that collects and prepares the data a data-driven model ingests for training and retraining. This work presents a data mining framework that extracts and transforms data from publicly accessible, siloed, multi-country database systems and whose applicability domain is the LCI of chemical EoL off-site transfers. A database system must meet three requirements to be integrated into the framework: (i) it is available in English, (ii) it is chemical-centric, that is, focused on individual chemicals rather than total transferred amounts, and (iii) its data granularity is sufficient to describe the entities involved in a transfer. Thus, the collected data describe the generator, chemical, and type of EoL activity (e.g., surface impoundment) involved in transferring a chemical contained in a waste stream to an off-site location for EoL management. An exploratory data analysis shows the implications and limitations of using the data in data-driven models such as classifiers, e.g., to predict potential EoL chemical exposure scenarios. The data mining pipeline can provide datasets annually, which helps address the decay of data-driven model performance over time caused by changes in the statistical distribution of the independent variables (e.g., generator industry sector). Moreover, the framework deals with changes in the relationship between the independent variables and the target variable (e.g., a potential EoL activity and a chemical of concern).
The framework can also help account for regulatory and environmental law differences across geographical locations and over the years by incorporating LCI data from different countries and capturing insights from the data about how environmental reporting criteria change from one year to the next.
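
As a rough illustration of the distribution change mentioned above, the following Python sketch compares the share of each generator industry sector between two reporting years using the total variation distance. It is not the framework's actual method: the function names, the sample sector labels, and the 0.2 retraining threshold are all assumptions made for this example.

```python
from collections import Counter


def category_shares(values):
    """Return the relative frequency of each category in a list of labels."""
    if not values:
        raise ValueError("values must not be empty")
    counts = Counter(values)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


def total_variation_distance(reference, current):
    """Half the sum of absolute share differences (0 = identical, 1 = disjoint)."""
    p = category_shares(reference)
    q = category_shares(current)
    categories = set(p) | set(q)
    return 0.5 * sum(abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in categories)


# Hypothetical generator industry sectors reported in two consecutive years.
sectors_year_1 = ["chemical manufacturing"] * 6 + ["metal fabrication"] * 4
sectors_year_2 = (
    ["chemical manufacturing"] * 3
    + ["metal fabrication"] * 4
    + ["electronics"] * 3
)

drift = total_variation_distance(sectors_year_1, sectors_year_2)

# Illustrative threshold only; a real value would need to be chosen empirically.
DRIFT_THRESHOLD = 0.2
needs_retraining = drift > DRIFT_THRESHOLD
```

A check like this covers only shifts in one independent variable. Changes in the relationship between the independent variables and the target would need a separate comparison, for example tracking classifier accuracy on each new annual dataset.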

Record Details:

Record Type: DOCUMENT (PRESENTATION/SLIDE)
Product Published Date: 11/18/2022
Record Last Revised: 03/11/2024
OMB Category: Other
Record ID: 360692