Science Inventory

20190825 - Crafting Persistent Identifiers and Structure-based Representations in DSSTox as Surrogates for Chemical Names (ACS Fall 2 of 3)


Grulke, C., A. Richard, AND A. Williams. 20190825 - Crafting Persistent Identifiers and Structure-based Representations in DSSTox as Surrogates for Chemical Names (ACS Fall 2 of 3). American Chemical Society Fall Meeting, San Diego, CA, August 25 - 29, 2019.


Abstract and presentation at the American Chemical Society Fall Meeting, August 2019, for the session "Chemical Nomenclature & Representation: Past, Present & Future."


Nomenclature has been key to the conveyance of chemically associated information between scientists for over a century, as well as in the unstructured data environments that existed prior to the development of large-scale chemical databases. EPA’s National Center for Computational Toxicology focuses on the collection and aggregation of hazard, exposure, and persistence data linked to chemicals to support environmental risk assessment. For this purpose, unique, persistent identifiers and structures provide more useful representations of chemical substances than human-interpretable (but variable, error-prone, and malleable) names. Unique DSSTox substance identifiers (DTXSIDs) and substance-list record identifiers (DTXRIDs) provide a simple way to manage chemical information associated with a particular sample or data source separately from a generic substance representation capable of aggregating data from many sources. When assignable, a chemical structure provides an unambiguous and information-rich representation of a substance that is universally interpretable by chemists and not subject to the permutations of chemical names. A unique DSSTox structure identifier (DTXCID), in turn, provides an efficient indexing of a structure. Despite the value of such indexing, chemical names are, and will remain, in widespread use as chemical currency by the public and across scientific and regulatory domains. As such, names will continue to serve as primary linkages to source data and, thus, will continue to play an essential role in ensuring the accuracy of structure assignments, as well as in indexing where structures cannot be assigned. As we extend our structure and substance storage methods to better document partially defined and ill-defined chemistry, the importance of the assigned name within our databases will likely wane, but it currently serves as the most heavily weighted source identifier when attempting to resolve a DTXRID to a DTXSID.
This abstract does not necessarily represent the views or policies of the U.S. Environmental Protection Agency.
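
The relationship among the three identifier types described in the abstract can be illustrated with a minimal sketch. It is not the DSSTox implementation: the class names, the use of a CASRN as a second source identifier, the numeric weights, the matching threshold, and the placeholder identifier strings are all assumptions made for illustration. The abstract states only that the name is currently the most heavily weighted source identifier when resolving a DTXRID to a DTXSID.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# Hypothetical weights. The only property taken from the abstract is that
# the name outweighs any other source identifier.
WEIGHTS = {"name": 3, "casrn": 2}


@dataclass(frozen=True)
class Substance:
    """Generic substance that aggregates data from many sources."""
    dtxsid: str                    # substance identifier (placeholder values below)
    name: str
    casrn: Optional[str] = None    # assumed secondary identifier
    dtxcid: Optional[str] = None   # structure identifier; None if no structure is assignable


@dataclass(frozen=True)
class SourceRecord:
    """One row of a source substance list, carrying that source's identifiers."""
    dtxrid: str
    name: Optional[str] = None
    casrn: Optional[str] = None


def _norm(value: Optional[str]) -> Optional[str]:
    """Case- and whitespace-insensitive comparison key."""
    return value.strip().lower() if value else None


def score(record: SourceRecord, substance: Substance) -> int:
    """Sum the weights of every source identifier that matches the substance."""
    total = 0
    for field, weight in WEIGHTS.items():
        rec_value = _norm(getattr(record, field))
        if rec_value is not None and rec_value == _norm(getattr(substance, field)):
            total += weight
    return total


def resolve(record: SourceRecord, substances: Iterable[Substance],
            threshold: int = 2) -> Optional[str]:
    """Return the DTXSID of the best-scoring substance, or None if nothing
    reaches the (hypothetical) threshold and the record needs manual review."""
    best = max(substances, key=lambda s: score(record, s), default=None)
    if best is None or score(record, best) < threshold:
        return None
    return best.dtxsid


# Placeholder identifiers, not real DSSTox values.
SUBSTANCES = [
    Substance("DTXSID-EXAMPLE-A", "Bisphenol A", "80-05-7", "DTXCID-EXAMPLE-A"),
    Substance("DTXSID-EXAMPLE-B", "Atrazine", "1912-24-9", "DTXCID-EXAMPLE-B"),
]

# A source record whose name and CASRN disagree: the name wins under these weights.
CONFLICTING = SourceRecord("DTXRID-EXAMPLE-1", name="bisphenol a", casrn="1912-24-9")
```

With these assumed weights, `resolve(CONFLICTING, SUBSTANCES)` selects the substance matched by name rather than the one matched by CASRN, and a record that matches nothing returns `None` rather than being forced onto a substance.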

Record Details:

Product Published Date: 08/29/2019
Record Last Revised: 09/05/2019
OMB Category: Other
Record ID: 346352