Science Inventory

Novel text analytics approach to identify relevant literature for human health risk assessments: A pilot study with health effects of in utero exposures

Citation:

Cawley, M., R. Beardslee, B. Beverly, A. Hotchkiss, E. Kirrane, R. Sams, A. Varghese, J. Wignall, AND J. Cowden. Novel text analytics approach to identify relevant literature for human health risk assessments: A pilot study with health effects of in utero exposures. ENVIRONMENT INTERNATIONAL. Elsevier B.V., Amsterdam, Netherlands, 134:105228, (2019). https://doi.org/10.1016/j.envint.2019.105228

Impact/Purpose:

Systematic review methods improve the transparency and objectivity of literature-based evaluation in human health risk assessments. Literature searches for human health risk assessments that address cumulative risk typically identify many studies, often tens of thousands. Reviewing such large bodies of literature under resource constraints requires innovative approaches for identifying relevant literature for review and data extraction. We demonstrate that machine learning techniques, including supervised clustering (a type of semi-supervised learning), can eliminate the need to manually screen most of the results of a comprehensive search for human health impacts resulting from in utero exposure to environmental chemicals. The information contained in this manuscript may be of interest to risk assessors from multiple disciplinary backgrounds as well as systematic review experts.

Description:

Systematic review methods improve the transparency and objectivity of literature-based evaluation in human health risk assessments. To adequately address human health effects for multiple organ systems across a broad dose range and various routes of exposure, the approach to developing these assessments must evolve to accommodate the expansive literature that must be assessed to characterize risk fully. This is particularly true when evaluating cumulative health risks from multiple agents and stressors. Literature searches for human health risk assessments that address cumulative risk typically identify many studies, often tens of thousands. Reviewing such large bodies of literature under resource constraints requires innovative approaches for identifying relevant literature for review and data extraction. We demonstrate that machine learning techniques, including supervised clustering (a type of semi-supervised learning), can eliminate the need to manually screen most of the results of a comprehensive search for human health impacts resulting from in utero exposure to environmental chemicals. Supervised machine learning methods require training data, which can be time-consuming to gather. We use a novel approach for our initial training corpus that repurposes a readily available, expert-curated set of studies from US EPA's Integrated Risk Information System (IRIS) program. In a case study, the machine learning techniques we used performed comparably to expert review of the literature.
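
The idea of using a small, expert-curated seed set to prioritize a large body of search results can be illustrated with a minimal sketch. This is not the authors' method: the study describes supervised clustering and machine learning, whereas the sketch below substitutes a much simpler seed-based similarity ranking (TF-IDF vectors scored by cosine similarity to the centroid of the seed studies). All titles, function names, and variable names are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict of term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        # Smoothed inverse document frequency, so no weight is ever zero.
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def centroid(vectors):
    """Average a list of sparse vectors into a single sparse vector."""
    total = Counter()
    for v in vectors:
        for t, w in v.items():
            total[t] += w
    return {t: w / len(vectors) for t, w in total.items()}


def rank_by_seed_similarity(seeds, candidates):
    """Rank candidates by similarity to the seed centroid, most similar first."""
    vectors = tfidf_vectors(seeds + candidates)
    seed_centroid = centroid(vectors[: len(seeds)])
    scored = [
        (cosine(v, seed_centroid), doc)
        for v, doc in zip(vectors[len(seeds):], candidates)
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical seed studies standing in for an expert-curated set.
seed_studies = [
    "prenatal exposure to lead and birth weight in infants",
    "in utero exposure to arsenic and fetal growth",
]
# Hypothetical unscreened search results.
search_results = [
    "maternal exposure to mercury during pregnancy and fetal development",
    "corrosion resistance of steel alloys in marine pipelines",
    "prenatal arsenic exposure and infant birth outcomes",
]

ranked = rank_by_seed_similarity(seed_studies, search_results)
for score, title in ranked:
    print(f"{score:.3f}  {title}")
```

In a workflow of this kind, reviewers would screen the highest-ranked results first, and low-scoring results become candidates for exclusion without manual review. Any cutoff for exclusion would need to be validated against expert screening decisions, as the study does for its own methods.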

Record Details:

Record Type: DOCUMENT (JOURNAL/PEER REVIEWED JOURNAL)
Product Published Date: 11/08/2019
Record Last Revised: 09/18/2023
OMB Category: Other
Record ID: 351532