Science Inventory

Estimation of Carcinogenicity using Hierarchical Clustering and Nearest Neighbor Methodologies

Citation:

MARTIN, T. M. Estimation of Carcinogenicity using Hierarchical Clustering and Nearest Neighbor Methodologies. Presented at The Scarlett Workshop on in silico methods for carcinogenicity and mutagenicity, Milan, ITALY, April 02 - 04, 2008.

Impact/Purpose:

To inform the public.

Description:

Previously, a hierarchical clustering (HC) approach and a nearest neighbor (NN) approach were developed to model acute aquatic toxicity endpoints. These approaches were designed to correlate toxicity for large, noncongeneric data sets. In this study, the same approaches were applied to rodent carcinogenicity data sets. The HC approach uses Ward’s method to divide an experimental toxicity training set into a series of structurally similar clusters, with structural similarity defined in terms of 2-D and 3-D descriptors. A genetic-algorithm-based technique is used to generate statistically valid QSAR models for each cluster. The toxicity of a given query compound is estimated as the average of the predictions from the cluster models whose chemicals are the most structurally similar to the query compound. In the NN method, the predicted toxicity is simply the average of the toxicities of the three closest analogs in the training set (provided they exceed a minimum degree of similarity). The NN approach provides a baseline prediction against which other prediction methodologies can be compared. In external cross-validation, these approaches achieved prediction concordances of 60-65% for the Carcinogenic Potency Database (CPDB).
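
The NN rule described above can be illustrated with a minimal Python sketch. It is not the author's implementation. The abstract does not state the similarity measure, the similarity cutoff, or what happens when fewer than three analogs qualify, so the cosine similarity, the `min_similarity` value of 0.5, and the choice to return `None` in that case are all assumptions made for illustration. The function names and the descriptor vectors are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two descriptor vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nn_predict(query, training_set, k=3, min_similarity=0.5):
    """Average the toxicities of the k closest analogs of `query`.

    `training_set` is a list of (descriptor_vector, toxicity) pairs.
    Returns None when fewer than k analogs reach `min_similarity`
    (an assumed policy; the abstract only says a minimum applies).
    """
    scored = [(cosine_similarity(query, desc), tox) for desc, tox in training_set]
    analogs = [(sim, tox) for sim, tox in scored if sim >= min_similarity]
    if len(analogs) < k:
        return None  # no prediction: not enough sufficiently similar analogs
    analogs.sort(key=lambda pair: pair[0], reverse=True)
    closest = analogs[:k]
    return sum(tox for _, tox in closest) / k


# Hypothetical two-descriptor training set; 1.0 = carcinogen, 0.0 = noncarcinogen.
training = [
    ([1.0, 0.0], 1.0),
    ([0.9, 0.1], 1.0),
    ([0.8, 0.3], 0.0),
    ([0.0, 1.0], 0.0),
]
score = nn_predict([1.0, 0.1], training)
```

With binary carcinogenicity labels, the returned average can be read as the fraction of the three closest analogs that are carcinogens. Turning that fraction into a positive or negative call would need a cutoff, which the abstract does not specify.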

Record Details:

Record Type: DOCUMENT (PRESENTATION/ABSTRACT)
Product Published Date: 04/03/2008
Record Last Revised: 07/18/2008
OMB Category: Other
Record ID: 189447