Science Inventory

A comparison of three liquid chromatography (LC) retention time prediction models

Citation:

McEachran, A., K. Mansouri, S. Newton, B. Beverly, J. Sobus, AND A. Williams. A comparison of three liquid chromatography (LC) retention time prediction models. TALANTA. Elsevier Science Ltd, New York, NY, 182:371-379, (2018). https://doi.org/10.1016/j.talanta.2018.01.022

Impact/Purpose:

This paper compares the relative predictive ability and applicability to NTA workflows of three RT prediction models: (1) a logP (octanol-water partition coefficient)-based model using EPI Suite™ logP predictions; (2) a commercially available ACD/ChromGenius model; and (3) a newly developed Quantitative Structure Retention Relationship model called OPERA-RT.

Description:

High-resolution mass spectrometry (HRMS) data has revolutionized the identification of environmental contaminants through non-targeted analysis (NTA). However, chemical identification remains challenging due to the vast number of unknown molecular features typically observed in environmental samples. Advanced data processing techniques are required to improve chemical identification workflows. The ideal workflow brings together a variety of data and tools to increase the certainty of identification. One such tool is chromatographic retention time (RT) prediction, which can be used to reduce the number of possible suspect chemicals within an observed RT window. This paper compares the relative predictive ability and applicability to NTA workflows of three RT prediction models: (1) a logP (octanol-water partition coefficient)-based model using EPI Suite™ logP predictions; (2) a commercially available ACD/ChromGenius model; and (3) a newly developed Quantitative Structure Retention Relationship model called OPERA-RT. Models were developed using the same training set of 78 compounds with experimental RT data and evaluated for external predictivity on an identical test set of 19 compounds. Both the ACD/ChromGenius and OPERA-RT models outperformed the EPI Suite™ logP-based RT model (training and test set R² = 0.81 and 0.92 for ACD/ChromGenius, 0.86 and 0.83 for OPERA-RT, and 0.66 and 0.69 for the logP-based model). Further, both OPERA-RT and ACD/ChromGenius predicted 95% of RTs within a ±15% chromatographic time window of experimental RTs. Based on these results, we simulated an NTA workflow with a ten-fold larger list of candidate structures generated for formulae of the known test set chemicals using the U.S. EPA’s CompTox Chemistry Dashboard (https://comptox.epa.gov/dashboard). RTs for all candidates were predicted using both ACD/ChromGenius and OPERA-RT, and RT screening windows were assessed for their ability to filter out unlikely candidate chemicals and enhance potential identification. Compared to ACD/ChromGenius, OPERA-RT screened out a greater percentage of candidate structures within a 3-minute RT window (60% vs. 40%) but retained fewer of the known chemicals (42% vs. 83%). By several metrics, the OPERA-RT model, generated as a proof-of-concept using a limited set of open source data, performed as well as the commercial tool ACD/ChromGenius when constrained to the same small training and test sets. As the availability of RT data increases, we expect the OPERA-RT model’s predictive ability will increase.
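
The RT screening step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `Candidate` and `screen_by_rt`, the example RT values, and the interpretation of the window as a symmetric half-width around the observed RT are all assumptions made for illustration. The sketch keeps candidates whose predicted RT falls inside the window and reports the fraction screened out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical candidate structure with a model-predicted RT (minutes)."""
    name: str
    predicted_rt: float


def screen_by_rt(candidates, observed_rt, half_width_min):
    """Keep candidates whose predicted RT lies within +/- half_width_min
    of the observed RT (assumed symmetric window; an illustration only).

    Returns (retained_candidates, fraction_screened_out).
    """
    if not candidates:
        return [], 0.0
    retained = [
        c for c in candidates
        if abs(c.predicted_rt - observed_rt) <= half_width_min
    ]
    removed = len(candidates) - len(retained)
    return retained, removed / len(candidates)


# Made-up example values, not data from the paper.
example_candidates = [
    Candidate("candidate_A", 4.0),
    Candidate("candidate_B", 6.5),
    Candidate("candidate_C", 9.0),
    Candidate("candidate_D", 12.0),
    Candidate("candidate_E", 7.9),
]

retained, fraction_removed = screen_by_rt(
    example_candidates, observed_rt=7.0, half_width_min=1.5
)
print([c.name for c in retained], fraction_removed)
```

A narrower window screens out more candidates but raises the risk of discarding the true chemical, which is the trade-off reported above for the two models.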

URLs/Downloads:

https://doi.org/10.1016/j.talanta.2018.01.022

Record Details:

Record Type: DOCUMENT (JOURNAL/PEER REVIEWED JOURNAL)
Product Published Date: 05/15/2018
Record Last Revised: 07/19/2018
OMB Category: Other
Record ID: 341570

Organization:

U.S. ENVIRONMENTAL PROTECTION AGENCY

OFFICE OF RESEARCH AND DEVELOPMENT

NATIONAL CENTER FOR COMPUTATIONAL TOXICOLOGY