Science Inventory

Predicting Chromatography-tandem Mass Spectrometry Amenability to Improve Non-targeted Analysis (ACS Fall 2020)

Citation:

Lowe, C., K. Isaacs, C. Grulke, J. Sobus, E. Ulrich, A. Chao, AND A. Williams. Predicting Chromatography-tandem Mass Spectrometry Amenability to Improve Non-targeted Analysis (ACS Fall 2020). American Chemical Society Fall meeting, San Francisco, California, August 16 - 20, 2020. https://doi.org/10.23645/epacomptox.12988733

Impact/Purpose:

With the increasing availability of high-resolution mass spectrometers, suspect screening and non-targeted analysis are becoming popular compound-identification tools for environmental researchers. Samples of interest often contain a large, unknown number of chemicals spanning the detectable mass range of the instrument. A chromatography method is used to separate these chemicals before they enter the mass spectrometer, and numerous combinations of gas or liquid chromatographs coupled to various types of mass spectrometers are available. Depending on the instrument combination used, the researcher is likely to observe a different subset of the compounds in the sample, because not every chemical is amenable to a given experimental technique and its associated method. It would be advantageous to predict this subset before the experiment, so that tentative identifications not amenable to the particular instrument can be ruled out. In this work, we combine data assembled during the EPA non-targeted analysis collaborative trial (ENTACT: https://www.epa.gov/sciencematters/epas-entact-study-breaks-new-ground-non-targeted-research) with the tandem mass spectrometry data in the MassBank of North America (MoNA) database. The assembled dataset, to date, totals 15,542 unique chemicals with at least one detection in GC-MS (both derivatized and non-derivatized compounds) and LC-MS (positive and negative modes for both ESI and APCI). The resulting detected/not-detected matrix has been combined with molecular descriptors (PaDEL: http://www.yapcwsoft.com/dd/padeldescriptor/) to model which chemicals are amenable to specific methods. We have constructed both k-nearest neighbor and random forest models for each method. To flag predictions made outside the scope of each model, we have implemented local and global applicability domain measures.
This abstract does not necessarily represent the views or policies of the U.S. Environmental Protection Agency.
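
The modeling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy descriptor values, the function names, the choice of k, the distance cutoff, and the two applicability domain definitions (a training-range check as the "global" measure, mean neighbor distance as the "local" measure) are all assumptions made for illustration. The abstract does not state which applicability domain measures were used, and the random forest models are not shown.

```python
import math

# Hypothetical toy data: (descriptor vector, detected flag) pairs for ONE method.
# Real inputs would be PaDEL descriptors and one column of the
# detected/not-detected matrix; these numbers are invented.
TRAIN = [
    ([1.0, 0.2], 1),
    ([1.1, 0.3], 1),
    ([0.9, 0.1], 1),
    ([3.0, 2.0], 0),
    ([3.2, 2.1], 0),
    ([2.9, 1.8], 0),
]

def euclidean(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def nearest(x, train, k):
    """Return the k closest training rows as sorted (distance, label) pairs."""
    return sorted((euclidean(x, vec), label) for vec, label in train)[:k]

def predict_detected(x, train, k=3):
    """k-nearest-neighbor majority vote: 1 = predicted amenable (detected)."""
    votes = [label for _, label in nearest(x, train, k)]
    return int(2 * sum(votes) > len(votes))

def in_global_domain(x, train):
    """Assumed global measure: each descriptor lies within the training range."""
    columns = zip(*(vec for vec, _ in train))
    return all(min(col) <= xi <= max(col) for xi, col in zip(x, columns))

def in_local_domain(x, train, k=3, max_mean_dist=0.5):
    """Assumed local measure: mean distance to the k neighbors is under a cutoff."""
    dists = [d for d, _ in nearest(x, train, k)]
    return sum(dists) / len(dists) <= max_mean_dist

def assess(x, train=TRAIN, k=3):
    """Bundle the prediction with both applicability domain flags."""
    return {
        "detected": predict_detected(x, train, k),
        "global_ad": in_global_domain(x, train),
        "local_ad": in_local_domain(x, train, k),
    }

if __name__ == "__main__":
    for query in ([1.0, 0.25], [2.0, 1.0], [10.0, 10.0]):
        print(query, assess(query))
```

In this sketch a query such as `[2.0, 1.0]` falls inside the descriptor ranges of the training set but is far from all of its neighbors, so it passes the global check and fails the local one. A prediction flagged this way would be treated with less confidence than one with close neighbors.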

Record Details:

Record Type: DOCUMENT (PRESENTATION/SLIDE)
Product Published Date: 08/20/2020
Record Last Revised: 11/20/2020
OMB Category: Other
Record ID: 350218