Structure-based QSAR Models to Predict Repeat Dose Toxicity Points of Departure
Citation:
Pradeep, P., K. Paul-Friedman, AND R. Judson. Structure-based QSAR Models to Predict Repeat Dose Toxicity Points of Departure. Computational Toxicology. Elsevier B.V., Amsterdam, Netherlands, 16(November 2020):100139, (2020). https://doi.org/10.1016/j.comtox.2020.100139
Impact/Purpose:
Toxicity in repeat dose studies refers to a range of adverse effects on one or more systems in adult animals, including changes in body weight and gross and/or histopathological changes in organs such as the liver and kidney. Toxicity can be measured at different levels of effect based on a dose-response assessment. These include the doses at which effects were first observed, i.e., the lowest effect level (LEL), lowest observed effect level (LOEL), and lowest observed adverse effect level (LOAEL), and the doses at which no effects were observed, i.e., the no effect level (NEL), no observed effect level (NOEL), and no observed adverse effect level (NOAEL). The NOAEL and LOAEL are defined via expert toxicological review to determine a critical effect level, whereas the LEL is the lowest dose at which a statistically significant effect was observed and the NEL is the next lower dose in the dose index [13]. Throughout, we use the term “effect level” to mean any dose in a study that showed a statistically significant treatment-related effect relative to study-level controls.
Effect levels are experimental values from individual studies. A point of departure (POD) is a chemical-level value of the dose at which treatment-related effects will occur, and it is derived from the experimental effect levels across studies. Repeat dose toxicity data for a chemical can come from several sources that measure the same or different effect levels, in studies with similar or different experimental designs. A chemical with multiple studies will therefore have multiple effect levels associated with it. Combining all of these effect level data maximizes the dataset available for training models and reduces the influence of potential outliers on quantitative estimates of effect levels (doses) and of effect level variability.
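As an illustration of how study-level effect levels might be collapsed into one chemical-level value, the following minimal Python sketch takes the median of the log10-transformed effect levels for each chemical. The chemical identifiers, the dose values, and the choice of the median as the aggregate are assumptions made for illustration, not the procedure documented in the study.

```python
import math
from collections import defaultdict
from statistics import median

# Hypothetical study-level records: (chemical id, effect level in mg/kg/day).
# These values are made up for illustration only.
records = [
    ("chem_A", 10.0),
    ("chem_A", 100.0),
    ("chem_A", 1000.0),
    ("chem_B", 1.0),
    ("chem_B", 100.0),
]


def chemical_level_pod(study_records):
    """Return {chemical id: median of log10 effect levels}.

    Assumption: the median is used as the aggregate; other summaries
    (e.g. a lower percentile) could be substituted.
    """
    by_chemical = defaultdict(list)
    for chemical, dose in study_records:
        by_chemical[chemical].append(math.log10(dose))
    return {chem: median(values) for chem, values in by_chemical.items()}


pods = chemical_level_pod(records)
for chemical, pod in sorted(pods.items()):
    print(f"{chemical}: {pod:.2f} log10-mg/kg/day")
```

Working in log10 units keeps a single very high or very low dose from dominating the aggregate, which matches the units used for the error metrics reported in the Description.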
The goal of the current analysis is to develop methods to predict “idealized” PODs that could be used for prioritization or screening level risk assessment. This idealized POD would be the result of multiple studies across multiple species and study types, with large dose ranges and small dose spacing. However, even if such a battery of studies were run, there would still be uncertainty in the value of the POD due to experimental and biological variability. Development of QSAR models for predicting PODs is challenging given this variability in the experimental data. Models developed from variable data are trained on, and evaluated against, uncertain values, so uncertainty in the training data leads to uncertainty both in the model predictions and in estimates of model performance. Incorporation of variability in computational model development, and subsequent quantification of data-driven uncertainty in model predictivity, are critically needed to improve the reliability and acceptance of computational models for screening level risk assessment.
Description:
Human health risk assessment for environmental chemical exposure is limited because the vast majority of chemicals have little or no experimental in vivo toxicity data. Data gap filling techniques, such as quantitative structure–activity relationship (QSAR) models based on chemical structure information, can predict hazard in the absence of experimental data. Risk assessment requires identification of a quantitative point-of-departure (POD) value, the point on the dose-response curve that marks the beginning of a low-dose extrapolation. This study presents two sets of QSAR models to predict POD values (POD_QSAR) for repeat dose toxicity. For training and validation, a publicly available in vivo toxicity dataset for 3592 chemicals was compiled using the U.S. Environmental Protection Agency’s Toxicity Value database (ToxValDB). The first set of QSAR models predicts point estimates of POD_QSAR using structural and physicochemical descriptors for repeat dose study type and species combinations. A random forest QSAR model using study type and species as descriptors showed the best performance, with an external test set root mean square error (RMSE) of 0.71 log10-mg/kg/day and coefficient of determination (R²) of 0.53. The second set of QSAR models predicts 95% confidence intervals for POD_QSAR using a constructed POD distribution with a mean equal to the median POD value and a standard deviation of 0.5 log10-mg/kg/day. This standard deviation is based on previously published estimates of typical study-to-study variability, which may lead to uncertainty in model predictions. Bootstrap resampling of the pre-generated POD distribution was used to derive point estimates and 95% confidence intervals for each POD prediction.
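The resampling idea can be sketched in a few lines of Python. The sketch below covers only the distribution and bootstrap steps for a single chemical; in the study each resample would feed QSAR model training, which is omitted here and replaced by the resample mean as a stand-in. The function names, the number of draws, the number of bootstrap replicates, and the percentile method for the interval are all assumptions; only the standard deviation of 0.5 log10-mg/kg/day and the centering on the median POD come from the text.

```python
import random
from statistics import mean

SD_LOG10 = 0.5  # study-to-study variability stated in the text (log10-mg/kg/day)


def percentile(sorted_values, q):
    """Linear-interpolation percentile for q in [0, 1] on a sorted list."""
    position = q * (len(sorted_values) - 1)
    lower_index = int(position)
    upper_index = min(lower_index + 1, len(sorted_values) - 1)
    fraction = position - lower_index
    return (sorted_values[lower_index] * (1 - fraction)
            + sorted_values[upper_index] * fraction)


def bootstrap_pod(median_pod, n_draws=200, n_boot=1000, seed=0):
    """Return (point estimate, lower 95% bound, upper 95% bound).

    Assumption: the resample mean stands in for a retrained model's
    prediction; n_draws and n_boot are illustrative, not from the study.
    """
    rng = random.Random(seed)
    # Pre-generate the POD distribution centered on the median POD.
    distribution = [rng.gauss(median_pod, SD_LOG10) for _ in range(n_draws)]
    replicate_estimates = []
    for _ in range(n_boot):
        # Resample with replacement from the pre-generated distribution.
        resample = rng.choices(distribution, k=n_draws)
        replicate_estimates.append(mean(resample))
    replicate_estimates.sort()
    return (mean(replicate_estimates),
            percentile(replicate_estimates, 0.025),
            percentile(replicate_estimates, 0.975))


point, lower, upper = bootstrap_pod(median_pod=2.0)
print(f"POD = {point:.2f} (95% CI {lower:.2f} to {upper:.2f}) log10-mg/kg/day")
```

Because the stand-in is a resample mean, the interval here reflects only sampling variation in the constructed distribution. An interval from retrained models would also carry model uncertainty and would generally be wider.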
Enrichment analysis to evaluate the accuracy of POD_QSAR showed that 80% of the 5% most potent chemicals fell within the 20% of chemicals predicted to be most potent, suggesting that the repeat dose POD QSAR models presented here may help inform screening level human health risk assessments in the absence of other data.
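The enrichment statistic described above can be computed as follows. This is a minimal sketch assuming that potency is ranked by POD value, with a lower POD meaning a more potent chemical, and that the cutoffs are fractions of the chemical count rounded to the nearest integer. The function name and the toy data are hypothetical.

```python
def enrichment(observed, predicted, true_frac=0.05, pred_frac=0.20):
    """Fraction of the most potent observed chemicals that also fall in
    the most potent fraction of predictions.

    observed, predicted: POD values in the same chemical order
    (lower POD = more potent, an assumption of this sketch).
    """
    n = len(observed)
    n_true = max(1, round(n * true_frac))
    n_pred = max(1, round(n * pred_frac))
    # Rank chemical indices from most potent (lowest POD) to least potent.
    by_observed = sorted(range(n), key=lambda i: observed[i])
    by_predicted = sorted(range(n), key=lambda i: predicted[i])
    most_potent = set(by_observed[:n_true])
    flagged = set(by_predicted[:n_pred])
    return len(most_potent & flagged) / n_true


# Toy data: 40 chemicals with evenly spaced observed PODs (log10 units).
observed = [i * 0.1 for i in range(40)]
predicted = list(observed)
predicted[1] = 3.5  # the second most potent chemical is badly mispredicted

print(enrichment(observed, observed))   # perfect predictions
print(enrichment(observed, predicted))  # one of the two most potent is missed
```

With 40 chemicals the 5% cutoff selects two chemicals and the 20% cutoff selects eight, so missing one of the two most potent chemicals halves the enrichment value.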
URLs/Downloads:
DOI: https://doi.org/10.1016/j.comtox.2020.100139