Science Inventory

Structured extraction of comprehensive chemical data from Wikipedia

Citation:

Sinclair, G., I. Thillainadarajah, B. Meyer, V. Samano, S. Sivasupramaniam, L. Adams, A. Richard, and A. Williams. Structured extraction of comprehensive chemical data from Wikipedia. Fall ACS, Chicago, IL, August 21-25, 2022. https://doi.org/10.23645/epacomptox.20496756

Impact/Purpose:

N/A

Description:

Wikipedia provides individual data pages for over 20,000 chemicals, making it a ubiquitous resource for open chemical information. Its collaborative community model introduces unique opportunities and challenges for the reuse of that information. Our new effort to comprehensively harvest the chemical space on Wikipedia, integrating automated, semi-automated, and human data extraction processes, has provided a structured dataset for comparison and reuse across sources. We discuss previous efforts of this kind and expansions on their processes, correlate this dataset with other open resources, and analyze the derived data in relation to those resources. This abstract does not necessarily represent the views or policies of the U.S. Environmental Protection Agency.

Record Details:

Record Type: DOCUMENT (PRESENTATION/SLIDE)
Product Published Date: 08/25/2022
Record Last Revised: 08/31/2022
OMB Category: Other
Record ID: 355598