Development of Quantitative Structure-Activity Relationship (QSAR) Models to Predict the Carcinogenic Potency of Chemicals


EPA is announcing the release of the final report, Development of Quantitative Structure-Activity Relationship Models to Predict the Carcinogenic Potency of Chemicals. This summary describes recent progress in this area by the National Center for Environmental Assessment (NCEA). The primary objectives of this research were: (1) to test the feasibility of using alternative toxicity measures as surrogates for cancer potency determinations that would typically be derived from 2-year chronic bioassays, and (2) to develop QSAR models to predict the cancer potency of environmental chemicals. This research offers the risk assessment community an alternative approach for prioritizing and screening potentially carcinogenic chemicals at hazardous sites for clean-up considerations.


Determining the carcinogenicity and carcinogenic potency of new chemicals is both labor-intensive and time-consuming. To expedite the screening process, there is a need to either: (1) identify alternative toxicity measures (shorter duration) that may be used as surrogates for carcinogenic potency (chronic duration), or (2) develop quantitative structure-activity relationship (QSAR) models to predict the cancer slope factors of environmental chemicals. The project was therefore divided into two parts: the first focuses on using alternative toxicity measures to predict carcinogenic potency, and the second concentrates on developing QSAR models for predicting carcinogenic potency.

Part I. Alternative toxicity measures for carcinogenic potency currently used in the literature include the median lethal dose (LD50, the dose that kills 50% of a study population), the lowest-observed-adverse-effect level (LOAEL), and the maximum tolerated dose (MTD). The first aim of this study was to investigate the correlation between tumor dose (TD50) and each individual alternative toxicity measure (i.e., LD50, LOAEL, and MTD) as an estimator of carcinogenic potency. A second aim was to develop a Classification and Regression Tree (CART) model relating TD50 to a combination of predictor variables (i.e., all three alternative toxicity measures, as well as additional parameters such as mutagenicity and logP) to predict the carcinogenic potency of new chemicals. Rat TD50 values for 590 structurally diverse chemicals were obtained from the Cancer Potency Database (CPDB), and the three alternative toxicity measures considered in this study were either obtained from experimental data available in the published literature or estimated using TOPKAT, a toxicity estimation software package. Poor correlations were obtained between carcinogenic potency and the individual alternative toxicity measures (both experimental and TOPKAT-estimated) for the CPDB chemicals. Three types of CART models were developed: (1) experimental values only, (2) mixed, with experimental values and missing values (estimated from the average of a particular cluster), and (3) predicted values only. NCEA found that CART model 1, developed using experimental data with no missing values as predictor variables, had the best predictivity and provided reasonable estimates of TD50 for nine chemicals that were part of an external validation set. However, when experimental values for the three alternative measures, mutagenicity, and logP are not available in the literature, the carcinogenic potency may instead be estimated from the CART model built with cluster-averaged missing values (CART model 2) or with predicted values only (CART model 3), which increases the number of chemicals with estimable cancer potencies.
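As a rough illustration of the regression-tree idea behind the CART models in Part I, the sketch below finds the single best split of one predictor by minimizing the squared error of the response in the two resulting leaves. It is a minimal sketch, not the models described in the report: it handles only one predictor and one split, the function names are hypothetical, and the log LD50 and log TD50 numbers are invented for illustration and are not taken from the CPDB.

```python
from statistics import fmean


def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = fmean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(x, y):
    """Return (threshold, total_sse) for the single split on x that
    minimizes the summed squared error of y in the two leaves.
    The threshold is None if no split improves on the unsplit data."""
    pairs = sorted(zip(x, y))
    best = (None, sse(y))
    for i in range(1, len(pairs)):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # identical predictor values cannot be separated
        left = [p[1] for p in pairs[:i]]
        right = [p[1] for p in pairs[i:]]
        total = sse(left) + sse(right)
        if total < best[1]:
            best = ((lo + hi) / 2, total)
    return best


def predict(threshold, x, y, x_new):
    """Predict with the mean response of the leaf that x_new falls into."""
    leaf = [yi for xi, yi in zip(x, y)
            if (xi <= threshold) == (x_new <= threshold)]
    return fmean(leaf)


# Hypothetical illustrative values, NOT data from the report or the CPDB.
log_ld50 = [1.0, 1.2, 1.5, 2.8, 3.0, 3.3]
log_td50 = [0.2, 0.4, 0.3, 1.9, 2.1, 2.0]

threshold, error = best_split(log_ld50, log_td50)
low_leaf = predict(threshold, log_ld50, log_td50, 1.1)
high_leaf = predict(threshold, log_ld50, log_td50, 3.1)
print(threshold, error, low_leaf, high_leaf)
```

With these made-up numbers the chosen threshold falls between 1.5 and 2.8, separating the three low-potency values from the three high ones. A full CART model repeats this search recursively over every predictor (here that would include LOAEL, MTD, mutagenicity, and logP) and then prunes the resulting tree.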

Part II. The overall risk associated with exposure to a chemical is determined by combining quantitative estimates of exposure to the chemical with its known health effects. For carcinogenic chemicals, oral slope factors (OSFs) and inhalation unit risks (IURs) are used to quantitatively estimate carcinogenic risk when combined with exposure information. Frequently, the literature lacks the animal or human studies needed to determine OSFs or IURs. This study aimed to circumvent this problem by developing QSAR models to predict the OSFs of chemicals without toxicity data. The OSFs of 70 chemicals based on human, rat, and mouse bioassay data were obtained from the United States Environmental Protection Agency's Integrated Risk Information System (IRIS) database. Both a global QSAR model covering all 70 chemicals and species- and/or sex-specific QSARs were developed in this study. Study results indicate that the species- and sex-specific QSARs (r² > 0.8, q² > 0.7) had better predictive abilities than the global QSAR developed using data from all species and sexes (r² = 0.77, q² = 0.73). The QSARs developed in this study were externally validated and demonstrated reasonable predictive abilities.
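The r² and q² statistics quoted for Part II can be made concrete with a small example. The sketch below fits a one-descriptor linear model and computes the fitted r² and a cross-validated q². It assumes q² means the leave-one-out statistic 1 - PRESS/SS_tot, which is one common definition; this summary does not state which cross-validation scheme the study used. The descriptor values, the log OSF values, and the function names are hypothetical and are not drawn from IRIS.

```python
from statistics import fmean


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def r_squared(x, y):
    """Fitted r2 = 1 - SS_res / SS_tot on the full data set."""
    a, b = fit_line(x, y)
    my = fmean(y)
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def q_squared_loo(x, y):
    """Leave-one-out q2 = 1 - PRESS / SS_tot (assumed definition)."""
    my = fmean(y)
    press = 0.0
    for i in range(len(x)):
        # Refit without chemical i, then predict the held-out value.
        a, b = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (a + b * x[i])) ** 2
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1 - press / ss_tot


# Hypothetical illustrative values, NOT data from the report or IRIS.
descriptor = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
log_osf = [-1.1, -0.4, 0.1, 0.4, 1.2, 1.4]

r2 = r_squared(descriptor, log_osf)
q2 = q_squared_loo(descriptor, log_osf)
print(f"r2 = {r2:.3f}, q2 = {q2:.3f}")
```

Because each held-out prediction comes from a model that never saw that chemical, q² computed this way cannot exceed r² for a least-squares fit. A large gap between the two is a common warning sign of overfitting.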



Jan 2009: Manuscript published on Development of Quantitative Structure-Activity Relationship (QSAR) Models to Predict the Carcinogenic Potency of Chemicals. I. Alternative Toxicity Measures as an Estimator of Carcinogenic Potency.
Dec 2009: Book chapter published on Structure-Activity Relationships for Carcinogenic Potential.
Mar 2011: Manuscript published on Development of Quantitative Structure-Activity Relationship Models to Predict the Carcinogenic Potency of Chemicals. II: Using Oral Slope Factor as a Measure of Carcinogenic Potency.
Sep 2011: Completion of the project; final summary posted on this Web site.
For more information contact:

Nina Ching Y. Wang
  • by phone at:   513-569-7752
  • by fax at:   513-487-2541
  • by email at: