Science Inventory

Estimation of the upper bound of predictive performance for alternative models that use in vivo reference data (OpenTox USA 2017)


Pham, L., K. Paul-Friedman, Woodrow Setzer, AND M. Martin. Estimation of the upper bound of predictive performance for alternative models that use in vivo reference data (OpenTox USA 2017). Presented at OpenTox USA 2017, Durham, North Carolina, July 12 - 13, 2017.


Poster presentation at the OpenTox USA meeting. The purpose of this work is to characterize the variability in points of departure (PODs) derived from in vivo data in ToxRefDB, which are often used as a reference for predictive models. The major finding is that the variance not explained by in vivo study parameters is approximately 0.33 on the log10(mg/kg/day) scale, or about 33% of the total variance in POD values. The corresponding root mean squared error is about 0.58, so the best any predictive model might do in predicting the available PODs in this set would be +/- 0.58 log10(mg/kg/day) units. If the 'true' POD being predicted was 10 mg/kg/day, a 'perfect' model might predict that POD between 2.63 and 38 mg/kg/day. Realistically, the prediction interval, when defined by 95% confidence limits, would likely make this prediction fall between 0.69 and 145 mg/kg/day. This work demonstrates that some of the variance in the available POD data cannot be explained by the study parameters and will therefore limit how small the residual mean squared error of any predictive model can be.
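The interval arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `fold_interval` and its `multiplier` parameter are hypothetical, and the multiplier of 2 used for the 95% case is an assumption (it reproduces the quoted 0.69 and 145 mg/kg/day, but the record does not state which multiplier was used).

```python
import math


def fold_interval(pod_mg_kg_day, rmse_log10, multiplier=1.0):
    """Return (low, high) in mg/kg/day for a POD +/- multiplier * RMSE on the log10 scale.

    Hypothetical helper for illustration only.
    """
    center = math.log10(pod_mg_kg_day)
    half_width = multiplier * rmse_log10
    return 10 ** (center - half_width), 10 ** (center + half_width)


# RMSE of ~0.58 log10(mg/kg/day) units, as reported in the record.
rmse = 0.58

# +/- 1 RMSE around a 'true' POD of 10 mg/kg/day: roughly 2.63 to 38 mg/kg/day.
low, high = fold_interval(10.0, rmse)

# Assumed multiplier of 2 for the approximate 95% limits: roughly 0.69 to 145 mg/kg/day.
low95, high95 = fold_interval(10.0, rmse, multiplier=2.0)

print(f"+/- 1 RMSE: {low:.2f} to {high:.0f} mg/kg/day")
print(f"+/- 2 RMSE: {low95:.2f} to {high95:.0f} mg/kg/day")
```

Because the error is expressed on the log10 scale, the interval is symmetric as a fold-change around the POD rather than as an additive margin, which is why the upper limit sits much farther from 10 mg/kg/day than the lower limit does.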


The large number of chemicals with limited toxicological information for chemical safety decision-making has accelerated the development of alternative models, which are often evaluated against animal toxicology studies. In vivo studies are generally considered the standard for hazard assessment, including point-of-departure (POD) determinations. However, the upper bound of predictivity of these alternative models for endpoints of concern will be limited by the variability in the reference data. This work quantified the variance within in vivo toxicity studies to bound the expected predictive performance of models that reference in vivo studies. Using the US EPA Toxicity Reference Database (ToxRefDB) systemic toxicity POD values and associated study parameters, multiple linear regression and analysis of variance were performed to quantify the variance explained by study parameters, e.g., chemical treatment, study type, species, strain, and dose spacing. The mean squared error (MSE) was used to estimate the unexplained variance, i.e., the portion of the total variance that cannot be explained using known study parameters. The total variance in the set of log10(POD) values was approximately 1, and the residual MSE after adjusting for study parameters was ~0.33. The root mean squared error (RMSE) was ~0.58; i.e., the best that any prediction model could do in predicting log10(POD) would be within ± 0.58 log10(mg/kg/day) units. Chemical treatment accounted for 0.46 of the total variance in log10(POD), whereas the other study conditions explained substantially less. As chemical treatment appeared to account for a significant fraction of the explained variability, two approaches were used to evaluate the impact of higher level chemical descriptors on explained variance.
The purpose of this analysis was to determine whether chemical structure descriptors could be used as a surrogate for chemical identity in the model, which might indicate that tools such as read-across would be useful for predicting systemic PODs for new chemicals. The first approach stratified the dataset using chemical class, and the second approach used ToxPrint chemotype fingerprints to group chemicals based on structural features. Use of chemical class and the study parameters marginally improved the explained variance over study parameters alone for some classes. Use of chemotype fingerprints failed to provide groupings that would reduce the number of chemical descriptors needed to achieve the same level of explained variance as using individual chemical identity. Both of these approaches demonstrated the limited capacity of chemical structure to predict complex, systemic POD endpoints for a heterogeneous dataset. This characterization of unexplained variance in in vivo data suggests that the upper bound on the residual MSE for predictive models of in vivo POD data may approach 0.33 log10(mg/kg/day). This abstract may not reflect U.S. EPA policy.
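The idea of using a residual MSE as an estimate of unexplained variance can be sketched with a deliberately simplified, single-factor version of the analysis. The abstract describes a multiple linear regression over several study parameters; the sketch below assumes only one factor (chemical identity), so it is a one-way analysis of variance and not the authors' model. The function names and the log10(POD) values are invented for illustration and are not ToxRefDB data.

```python
import math
from collections import defaultdict


def total_variance(records):
    """Sample variance of all log10(POD) values, ignoring group labels."""
    values = [value for _, value in records]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


def residual_mse(records):
    """Within-group mean squared error for (label, value) records.

    Equivalent to the residual MSE of a regression with one categorical
    predictor: within-group sum of squares over (n - number of groups).
    """
    groups = defaultdict(list)
    for label, value in records:
        groups[label].append(value)
    n = sum(len(values) for values in groups.values())
    k = len(groups)
    if n <= k:
        raise ValueError("need more observations than groups")
    ss_within = 0.0
    for values in groups.values():
        mean = sum(values) / len(values)
        ss_within += sum((v - mean) ** 2 for v in values)
    return ss_within / (n - k)


# Invented log10(POD) values for three hypothetical chemicals, three studies each.
records = [
    ("chem_A", 1.0), ("chem_A", 1.4), ("chem_A", 0.6),
    ("chem_B", 2.0), ("chem_B", 2.5), ("chem_B", 1.5),
    ("chem_C", 0.0), ("chem_C", 0.3), ("chem_C", -0.3),
]

total = total_variance(records)   # spread across all studies
mse = residual_mse(records)       # spread left after accounting for chemical
rmse = math.sqrt(mse)             # same quantity in log10(mg/kg/day) units

print(f"total variance: {total:.3f}")
print(f"residual MSE:   {mse:.3f} ({mse / total:.0%} of total)")
print(f"RMSE:           {rmse:.3f}")
```

In this toy dataset, replicate studies of the same chemical disagree with one another, and that disagreement is what the residual MSE captures. A model that predicts a single POD per chemical cannot be expected to do better than this replicate-to-replicate spread when it is scored against the same studies.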




Record Details:

Product Published Date: 07/13/2017
Record Last Revised: 03/20/2018
OMB Category: Other
Record ID: 339991