
Final Report: Computational Tools for the Prediction and Classification of Estrogenic Compounds

EPA Grant Number: R826133
Title: Computational Tools for the Prediction and Classification of Estrogenic Compounds
Investigators: Welsh, William J.
Institution: University of Missouri - St Louis
EPA Project Officer: Deener, Kacee
Project Period: January 1, 1998 through December 31, 2000
Project Amount: $433,758
RFA: Endocrine Disruptors (1997)
Research Category: Economics and Decision Sciences, Endocrine Disruptors, Health, Safer Chemicals



The growing national concern that certain exogenous chemicals, known collectively as endocrine disrupting compounds (EDCs), can disrupt the sensitive endocrine systems of humans and wildlife by mimicking endogenous hormones has resulted in federal legislation mandating that the U.S. Environmental Protection Agency (EPA) develop a screening and testing program to assess the EDC activity of a large and growing number of widely prevalent chemicals. This prodigious and expensive task, which will require extensive in vitro and in vivo testing against multiple biological endpoints, could be greatly facilitated by implementing paradigms based on computer-based molecular models that enable rapid identification and prediction of potential EDCs. Computer-based tools were developed and validated to enable rapid and large-scale screening of chemicals in terms of their potential EDC activity as a basis for prioritizing these chemicals for subsequent biological testing.

The overall objectives of this project were to: (1) construct and validate an integrated array of quantitative tools for modeling and predicting potential EDCs based on Quantitative Structure-Activity Relationship (QSAR) models, which use physicochemical properties derived solely from chemical structure; (2) explore correlations between estrogenic activities predicted by the QSAR models for one species and experimental activities for another species (interspecies extrapolation); and (3) extend the QSAR models to predict estrogenic activity in vitro across the hierarchy of molecular-level biological complexity, beginning with ligand-estrogen receptor binding and progressing toward gene regulation and transcription. The ultimate goal of this research was to formulate models for predicting endocrine disrupting effects both in vitro and in vivo for a large number of natural and synthetic chemicals from numerous structural classes. These models offered significant promise as guides for selecting regulatory protocols and for prioritizing EDCs for investigation based on their predicted danger to the environment. They also may have significant utility in screening compounds for further testing in a regulatory process.

Summary/Accomplishments (Outputs/Outcomes):

Significant progress was achieved during the project. Specific accomplishments are summarized as follows:

• Several three-dimensional (3D) QSAR models for both estrogen receptor (ER) subtypes, ER-α and ER-β, were developed using Comparative Molecular Field Analysis (CoMFA) based on relative binding affinity (RBA) data for a series of 31 structurally diverse chemicals. These QSAR models exhibited excellent internal consistency (r² > 0.95) and predictive ability (q² > 0.6). The CoMFA contour plots, as well as other aspects of the models, suggest a close similarity between the receptors in terms of their mode of binding, yet at the same time provide a rational basis for ligand selectivity.

• A 3D-QSAR model was constructed using CoMFA based on RBA data spanning four orders of magnitude for 53 2-phenylindoles from an ER binding assay using calf uterine cytosol. This 3D-QSAR model exhibited excellent self-consistency and predictive ability. To examine the possibility of interspecies extrapolation, the calf ER-derived CoMFA model was used to predict the calf ER RBA values for 14 estrogenic compounds for which human ER RBA values were available. The correlation between these predicted calf ER RBA values and the corresponding experimental human ER RBA values was quite high (r = 0.80), especially considering the species-to-species differences. A separate QSAR model constructed using classical physicochemical descriptors as the independent variables was shown to be inferior in statistical quality to the 3D-QSAR models derived from CoMFA.

• Based on the yeast-based reporter gene assay data of Coldham et al. (Environmental Health Perspectives 1997;107(7):734-741) for a data set of 53 compounds whose structures cover several chemical classes (e.g., steroids, synthetic estrogens, phytoestrogens, antiestrogens, DDTs, polychlorinated biphenyls (PCBs), industrial chemicals), 3D-QSAR models derived from CoMFA have been constructed that exhibit a high degree of self-consistency (r² > 0.912) and predictive ability (q² > 0.481). Three different alignment schemes (SEAL, atom fit, field fit) were tested to obtain the best CoMFA model. The field-fit alignment scheme yielded the most predictive CoMFA model (r² = 0.950, q² = 0.606). For the sake of comparison, several Hologram QSAR (HQSAR) models were constructed that also yielded a high degree of predictive ability (r² = 0.893, q² = 0.666).

• A substantial body of evidence indicates that both humans and wildlife suffer adverse health effects from exposure to environmental chemicals that are capable of interacting with the endocrine system. The recent cloning of the estrogen receptor β subtype (ER-β) suggests that the selective effects of estrogenic compounds may arise in part from the control of different subsets of estrogen-responsive promoters by the two ER subtypes, ER-α and ER-β. To identify the structural prerequisites for ligand-ER binding and to discriminate ER-α and ER-β in terms of their ligand-binding specificities, CoMFA was employed to construct a 3D-QSAR model on a data set of 31 structurally diverse compounds for which competitive binding affinities have been measured against both ER-α and ER-β. Structural alignment of the molecules in CoMFA was achieved by maximizing overlap of their steric and electrostatic fields using the Steric and Electrostatic ALignment (SEAL) algorithm. The final CoMFA models, generated by correlating the calculated 3D steric and electrostatic fields with the experimentally observed binding affinities using partial least-squares (PLS) regression, exhibited excellent self-consistency (r² > 0.99) as well as high internal predictive ability (q² > 0.65) based on cross-validation. CoMFA-predicted values of RBA for a test set of compounds outside of the training set were consistent with experimental observations. These CoMFA models can serve as guides for the rational design of ER ligands that possess preferential binding affinities for either ER-α or ER-β. These models also can prove useful in risk assessment programs to identify real or suspected EDCs.

• The vaporization enthalpies of 16 polychlorinated biphenyls have been determined by correlation gas chromatography. This study was prompted by the realization that the vaporization enthalpies of the standard compounds used in previous studies, octadecane and eicosane, were values measured at 340 and 362 K, respectively, rather than at 298 K. Adjustment to 298 K amounts to a 7-8 kJ/mol increment in the values. With the inclusion of this adjustment, vaporization enthalpies evaluated by correlation gas chromatography are in good agreement with the values reported previously in the literature. The present results are based on the vaporization enthalpies of several standards whose values are well established in the literature. The standards include a variety of n-alkanes and various chlorinated hydrocarbons. The vaporization enthalpies of PCBs increased with the number of chlorine atoms and were found to be larger for meta- and para-substituted polychlorinated biphenyls.

• Genomic effects of the active form of estrogen, 17β-estradiol, are mediated through at least two members of the steroid hormone receptor superfamily, ER-α and ER-β. Although the X-ray crystal structure of ER-α has been elucidated, coordinates of ER-β currently are not publicly available. Based on the significant structural conservation across members of the steroid hormone receptor family, and the high sequence homology between ER-α and ER-β (>60 percent), a homology model of the ER-β structure has been developed. Using the crystal structure of ER-α and the homology model of ER-β, a strong correlation was demonstrated between computed values of the binding energy and published values of the observed relative binding affinity (RBA) for a variety of compounds for both receptors.

• Three-dimensional quantitative structure-property relationship (3D-QSPR) models have been constructed using CoMFA to correlate the sublimation enthalpies at 298.15 K of a series of PCBs with their CoMFA-calculated physicochemical properties. Various alignment schemes, such as inertial, atom fit, as is, and field fit, were employed in this study. Separate CoMFA models were developed using different partial charge formalisms, namely, electrostatic potential (ESP) and Gasteiger-Marsili (GM) charges. Among the five different CoMFA models constructed for sublimation enthalpy (ΔsubHm(298.15 K)), the model that combined atom fit alignment and ESP charges yielded the greatest self-consistency (r² = 0.979) and internal predictive ability (r²cv = 0.764). This CoMFA model was used to predict ΔsubHm(298.15 K) of PCBs for which the corresponding experimental values are unavailable in the literature.

• 3D-QSPR models have been derived using CoMFA to correlate the vaporization enthalpies of a representative set of PCBs at 298.15 K with their CoMFA-calculated physicochemical properties. Various alignment schemes, such as inertial, as is, and atom fit, were employed in this study. The CoMFA models also were developed using different partial charge formalisms, namely, electrostatic potential (ESP) charges and Gasteiger-Marsili (GM) charges. The most predictive model for vaporization enthalpy (ΔvapHm(298.15 K)), with atom fit alignment and GM charges, yielded r² values of 0.852 (cross-validated) and 0.996 (conventional). The vaporization enthalpies of PCBs increased with the number of chlorine atoms and were found to be larger for the meta- and para-substituted isomers. This model was used to predict ΔvapHm(298.15 K) of the entire set of 209 PCB congeners.

• CoMFA has been used to develop 3D-QSPR models for the fusion enthalpy at the melting point (ΔfusHm(Tfus)) of a representative set of PCBs. Various alignment schemes, such as inertial, atom fit, field fit, and as is, were used in this study to evaluate the predictive capabilities of the models. The CoMFA models also have been derived using partial atomic charges calculated from ESP and GM methods. The combination of atom fit alignment and GM charges yielded the greatest self-consistency (r² = 0.955) and internal predictive ability (r²cv = 0.783). This CoMFA model was used to predict ΔfusHm(Tfus) of the entire set of 209 PCB congeners.

• QSARs attempt to correlate chemical structure with activity using statistical approaches. QSAR models are useful for various purposes, including the prediction of activities of untested chemicals. QSARs and other related approaches have attracted broad scientific interest, particularly in the pharmaceutical industry for drug discovery and in toxicology and environmental science for risk assessment. An assortment of new QSAR methods has been developed during the past decade, most of them focused on drug discovery. Besides advancing the fundamental knowledge of QSARs, these scientific efforts have stimulated their application in a wider range of disciplines such as toxicology, where QSARs have not yet gained full appreciation. In this study, the status of QSAR was summarized with emphasis on illuminating the utility and limitations of QSAR technology. 2D QSAR methods were reviewed, with a discussion of the availability and appropriate selection of molecular descriptors. 3D QSAR and key issues associated with this technology were then described, and the relative suitability of 2D and 3D QSAR was compared for different applications. Given the recent technological advances in biological research for rapid identification of drug targets, there are several examples in which QSAR approaches are employed in conjunction with improved knowledge of the structure and function of the target receptor. The study concludes by discussing statistical validation of QSAR models, a topic that has received sparse attention in recent years despite its critical importance.

• A novel, patent-pending technique was developed and applied for rapidly comparing the shapes of molecules to each other and to target receptor sites/subsites. This approach, which we call Shape Signatures, involves generating compact representations of shape that will enable rapid shape comparisons of library compounds against a target receptor (e.g., estrogen receptors) or against other compounds (e.g., known estrogenic compounds). The proposed implementation of shape signatures will automatically incorporate the conformational flexibility of molecules and will bundle charge-based information (including hydrogen bond donors and acceptors) into the signature. Shape signatures are being applied in a number of typical risk assessment scenarios, involving both ligand- and receptor-based strategies. Although the focus of this study will be the estrogens due to the availability of suitable ligand and receptor data, this new technology is fully applicable to other classes of EDCs such as the androgens and the retinoid and thyroid hormones. Shape signatures are being extended to locate compounds in a database that are similar to compounds of known estrogenic activity. Compounds will be identified that are complementary to the binding pocket within the ligand binding domain (LBD) of estrogen receptors α and β (ERα and ERβ, respectively). For this purpose, the available crystal-structure geometry of the ERα LBD and a 3D homology model of the ERβ LBD recently constructed by the Welsh laboratory are being employed. The ultimate goal is to develop a tool for general use by scientists involved in risk assessment activities, including those who are not experts in computational chemistry or modeling. By its design, this tool will be easy to use, fast, extensible, physically intuitive, and visually accessible through a graphical user interface (GUI). At the same time, it will be capable of identifying/predicting potential EDCs that are conformationally flexible in many cases and that span chemical families varying widely in terms of molecular structure.
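
The conventional and cross-validated statistics (r² and q²) quoted for the QSAR and QSPR models above can be illustrated with a minimal sketch. It is not the project's CoMFA/PLS workflow: a one-descriptor least-squares line stands in for PLS regression on 3D fields, the descriptor and activity values are invented for illustration, and the function names are hypothetical. It assumes the common definition q² = 1 - PRESS/SS_total, where PRESS sums the squared leave-one-out prediction errors and SS_total is taken about the mean of all observed values.

```python
# Illustrative only: a one-descriptor least-squares fit stands in for PLS on
# CoMFA fields. The numbers below are made up, not data from the project.

def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = a*x + b."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def r_squared(x, y):
    """Conventional r2: fit on all compounds, score on the same compounds."""
    a, b = fit_line(x, y)
    my = sum(y) / len(y)
    ss_res = sum((yi - (a * xi + b)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def q_squared_loo(x, y):
    """Leave-one-out q2: each compound is predicted by a model fit without it."""
    my = sum(y) / len(y)
    press = 0.0
    for i in range(len(x)):
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        a, b = fit_line(x_train, y_train)
        press += (y[i] - (a * x[i] + b)) ** 2
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1.0 - press / ss_tot

# Hypothetical descriptor values and log(RBA)-like activities for 8 compounds.
descriptor = [0.5, 1.1, 1.9, 3.2, 3.8, 5.1, 6.0, 7.2]
log_rba = [-1.8, -1.2, -0.9, 0.1, 0.2, 1.1, 1.3, 2.2]

print(f"r2 = {r_squared(descriptor, log_rba):.3f}")
print(f"q2 = {q_squared_loo(descriptor, log_rba):.3f}")
```

Because every leave-one-out prediction is made without the compound being predicted, q² cannot exceed r² for a least-squares fit of this kind, which is consistent with the pattern in the values reported above (for example, r² = 0.950 against q² = 0.606).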

Journal Articles on this Report: 15 Displayed

Type | Citation | Project | Document Sources
Journal Article | DeLisle RK, Yu SJ, Nair AC, Welsh WJ. Homology modeling of the estrogen receptor subtype β (ER-β) and calculation of ligand binding affinities. Journal of Molecular Graphics & Modelling 2001;20(2):155-167. | R826133 (Final) | not available
Journal Article | Fang H, Tong WD, Welsh WJ, Sheehan DM. QSAR models in receptor-mediated effects: the nuclear receptor superfamily. Journal of Molecular Structure-Theochem 2003;622(1-2):113-125. | R826133 (Final) | not available
Journal Article | Ouyang M, Welsh WJ, Georgopoulos P. Gaussian mixture clustering and imputation of microarray data. Bioinformatics 2004;20(6):917-923. | R826133 (Final) | not available
Journal Article | Perkins R, Fang H, Tong WD, Welsh WJ. Quantitative structure-activity relationship methods: Perspectives on drug discovery and toxicology. Environmental Toxicology and Chemistry 2003;22(8):1666-1679. | R826133 (Final) | not available
Journal Article | Puri S, Chickos JS, Welsh WJ. Determination of vaporization enthalpies of polychlorinated biphenyls by correlation gas chromatography. Analytical Chemistry 2001;73(7):1480-1484. | R826133 (Final) | not available
Journal Article | Puri S, Chickos JS, Welsh WJ. Three-dimensional quantitative structure-property relationship (3D-QSPR) models for prediction of thermodynamic properties of polychlorinated biphenyls (PCBs): Enthalpy of sublimation. Journal of Chemical Information and Computer Sciences 2002;42(1):109-116. | R826133 (Final) | not available
Journal Article | Puri S, Chickos JS, Welsh WJ. Three-dimensional quantitative structure-property relationship (3D-QSPR) models for prediction of thermodynamic properties of polychlorinated biphenyls (PCBs): Enthalpies of fusion and their application to estimates of enthalpies of sublimation and aqueous solubilities. Journal of Chemical Information and Computer Sciences 2003;43(1):55-62. | R826133 (Final) | not available
Journal Article | Tabb MM, Kholodovych V, Grün F, Zhou C, Welsh WJ, Blumberg B. Highly chlorinated PCBs inhibit the human xenobiotic response mediated by the steroid and xenobiotic receptor (SXR). Environmental Health Perspectives 2004;112(2):163-169. | R826133 (Final); CR830686 (2003); CR830686 (2006) | not available
Journal Article | Tamura H, Yoshikawa H, Gaido KW, Ross SM, DeLisle RK, Welsh WJ, Richard AM. Interaction of organophosphate pesticides and related compounds with the androgen receptor. Environmental Health Perspectives 2003;111(4):545-552. | R826133 (Final) | not available
Journal Article | Tong W, Welsh WJ, Shi LM, Fang H, Perkins R. Structure-activity relationship approaches and applications. Environmental Toxicology and Chemistry 2003;22(8):1680-1695. | R826133 (Final) | not available
Journal Article | Tong W, Lowis DR, Perkins R, Chen Y, Welsh WJ, Goddette DW, Heritage TW, Sheehan DM. Evaluation of quantitative structure-activity relationship methods for large-scale prediction of chemicals binding to the estrogen receptor. 1998;38(4):669-677. | R826133 (Final) | not available
Journal Article | Xing L, Welsh WJ, Tong W, Perkins R, Sheehan DM. Comparison of estrogen receptor α and β subtypes based on comparative molecular field analysis (CoMFA). SAR and QSAR in Environmental Research 1999;10(2-3):215-237. | R826133 (Final) | not available
Journal Article | Yoon S, Welsh WJ. Identification of a minimal subset of receptor conformations for improved multiple conformation docking and two-step scoring. Journal of Chemical Information and Computer Sciences 2004;44(1):88-96. | R826133 (Final) | not available
Journal Article | Yu SJ, Keenan SM, Tong W, Welsh WJ. Influence of the structural diversity of data sets on the statistical quality of three-dimensional quantitative structure-activity relationship (3D-QSAR) models: Predicting the estrogenic activity of xenoestrogens. Chemical Research in Toxicology 2002;15(10):1229-1234. | R826133 (Final) | not available
Journal Article | Zauhar RJ, Moyna G, Tian LF, Li ZJ, Welsh WJ. Shape signatures: A new approach to computer-aided ligand- and receptor-based drug design. Journal of Medicinal Chemistry 2003;46(26):5674-5690. | R826133 (Final) | not available

Supplemental Keywords:

endocrine disruptors, endocrine disrupting compounds, risk assessment, quantitative structure-activity relationships, QSARs, estrogens, RFA, Health, Scientific Discipline, Environmental Chemistry, Health Risk Assessment, Endocrine Disruptors - Environmental Exposure & Risk, endocrine disruptors, Risk Assessments, Analytical Chemistry, Biochemistry, Children's Health, Biology, Endocrine Disruptors - Human Health, adverse outcomes, risk assessment, metabolites, expert systems, computational tool, Quantitative Structure-Activity Relationship, animal models, developmental processes, human exposure, estrogen response
