The low-resolution mass spectra of a set of 78 toxic volatile organic compounds were examined for information concerning chemical classes. These compounds were predominantly chloro- and/or bromoaromatics, -alkanes, or -alkenes, which are routinely sought at trace levels in ambient air samples by gas chromatography-mass spectrometry. The mass spectra of the pure compounds contained 151 different masses in the range of 35 to 256. The Shannon information content of each mass channel was calculated for both the binary-encoded and the full-intensity spectra, using 1% of the base peak as the threshold level. The 17 masses with the highest Shannon information content were retained as a basis set (the compressed set) for SIMCA pattern recognition using principal component analysis. Examination of the inherent class structure of the compressed set of mass spectra showed that the data separate into two major classes: aromatics and alkaenes (alkanes and alkenes). The aromatic compounds could be further divided into two subclasses: chloro- and nonchloroaromatics.
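
The channel-selection step described above can be illustrated with a minimal sketch. It covers only the binary-encoded case and assumes the per-channel Shannon information is the binary entropy H = -p log2(p) - (1 - p) log2(1 - p), where p is the fraction of spectra with a peak at that mass at or above 1% of the base peak; the abstract does not state the exact formula used. The function names and the toy spectra are hypothetical and are not data from the study.

```python
import math

def binary_encode(spectrum, threshold=0.01):
    """Return the masses whose intensity is at least threshold * base peak."""
    base = max(spectrum.values())
    return {m for m, i in spectrum.items() if i >= threshold * base}

def channel_entropy(p):
    """Shannon entropy in bits of a binary channel occupied with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def select_channels(spectra, k, threshold=0.01):
    """Rank mass channels by binary entropy and keep the k highest."""
    encoded = [binary_encode(s, threshold) for s in spectra]
    masses = sorted(set().union(*encoded))
    n = len(encoded)
    scores = {
        m: channel_entropy(sum(m in e for e in encoded) / n) for m in masses
    }
    # Ties are broken by ascending mass (an arbitrary choice for this sketch).
    ranked = sorted(masses, key=lambda m: (-scores[m], m))
    return ranked[:k], scores

# Hypothetical toy spectra as {mass: relative intensity}; not real data.
toy_spectra = [
    {77: 100, 51: 20, 35: 0.5},  # mass 35 falls below the 1% threshold
    {77: 100, 112: 60, 35: 5},
    {49: 100, 84: 70, 35: 30},
    {49: 100, 35: 40, 77: 2},
]

selected, scores = select_channels(toy_spectra, k=2)
print(selected)
```

Under these assumptions, a channel present in about half of the spectra scores highest, while a channel present in nearly all or nearly none of them carries little information for class discrimination. In the study itself, k would be 17 and the retained channels would form the compressed set passed to SIMCA.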