Risk Assessment of Food Allergenicity by a Data Base Approach

EPA Grant Number: R833137
Title: Risk Assessment of Food Allergenicity by a Data Base Approach
Investigators: Braun, Werner; Goldblum, Randall M.; Schein, Catherine H.
Current Investigators: Braun, Werner
Institution: The University of Texas Medical Branch - Galveston
EPA Project Officer: Hahn, Intaek
Project Period: October 1, 2006 through September 30, 2009
Project Amount: $600,000
RFA: Biotechnology: Potential Allergenicity of Genetically Engineered Foods (2006)
Research Category: Food Allergy, Health Effects, Health


The overall goal of the proposal is to further develop our Structural Database of Allergenic Proteins (SDAP) (http://fermi.utmb.edu/SDAP/) and its bioinformatics tools for estimating the potential allergenicity of novel recombinant food proteins. SDAP was developed as part of a long-standing collaborative project between the PI and co-investigators at UTMB. The research project will develop bioinformatics research tools to assess the human allergenicity of proteins in genetically engineered foods, and will test these software tools experimentally with pollen allergens. Our main hypothesis is that we can find sequence and structural motifs specific to each allergen family, which can be archived in a searchable data base. The data base can then be used to estimate the risk of allergenicity associated with new food proteins.


We will make a concerted effort to improve our approach to risk assessment by developing methods for rapidly identifying the locations and structural characteristics of conformational IgE epitopes. Experimental evidence suggests that conformational epitopes are important for food allergies. For instance, much of the increase in the frequency of allergic reactions to foods over the last few years is due to the recognition of oral allergy syndromes (OAS). These food reactions occur in patients who have become sensitized to inhaled (typically plant) proteins. When these patients ingest foods containing proteins with which their IgE antibodies cross-react, they develop immediate symptoms in the oral cavity and pharynx. Most of these reactions occur only on exposure to fresh fruits and vegetables, not cooked foods. This strongly suggests that the structures recognized on the food protein are conformational epitopes on the non-denatured proteins.


Several proprietary bioinformatics tools are already implemented in SDAP, such as the bioinformatics rules for potential allergenicity suggested by FAO/WHO and EFSA committees. We will use a large-scale statistical study to systematically test the validity of these rules and to measure how changing their parameters affects their sensitivity and specificity. General structure-activity relations among allergens will be examined by analyzing sequence requirements for potential IgE epitopes. We have established a new sequence similarity score for linear epitopes, the property distance (PD) score, which is based on physico-chemical profiles. The PD score is implemented in SDAP, and we have demonstrated that it can reliably identify IgE cross-reactivity. We will extend this approach to a larger pool of IgE epitopes from other food allergens and develop a quantitative scale, based on the PD values, for assessing the risk of allergenicity.
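As an illustration of the kind of rule whose parameters such a study would vary, the sketch below applies the commonly cited FAO/WHO 2001 criteria (more than 35% identity over an 80-residue window, or an exact match of 6 contiguous residues) and reports sensitivity and specificity on labeled test sequences. It is a simplified sketch, not the SDAP implementation: the function names are hypothetical, the window comparison is ungapped (practical screens typically use a local alignment program such as FASTA), and the sequences in the usage lines are toy data.

```python
def shared_kmer(query, allergen, k=6):
    """True if query and allergen share any exact run of k residues."""
    kmers = {allergen[i:i + k] for i in range(len(allergen) - k + 1)}
    return any(query[i:i + k] in kmers for i in range(len(query) - k + 1))


def max_window_identity(query, allergen, window=80):
    """Best identity fraction over all ungapped windows (assumption: no gaps).

    Overlaps shorter than `window` are still divided by `window`,
    so short overlaps cannot inflate the score.
    """
    best = 0.0
    # Each offset corresponds to one diagonal of the comparison matrix.
    for offset in range(-(len(allergen) - 1), len(query)):
        q0, a0 = max(0, offset), max(0, -offset)
        n = min(len(query) - q0, len(allergen) - a0)
        matches = [query[q0 + i] == allergen[a0 + i] for i in range(n)]
        running = sum(matches[:window])
        best = max(best, running / window)
        # Slide the window along the diagonal, updating the match count.
        for i in range(window, n):
            running += matches[i] - matches[i - window]
            best = max(best, running / window)
    return best


def flags_allergen(query, allergens, window=80, min_identity=0.35, k=6):
    """Apply both criteria against every known allergen sequence."""
    return any(
        shared_kmer(query, a, k)
        or max_window_identity(query, a, window) > min_identity
        for a in allergens
    )


def sensitivity_specificity(positives, negatives, allergens, **params):
    """Fraction of allergens flagged, and of non-allergens not flagged."""
    tp = sum(flags_allergen(q, allergens, **params) for q in positives)
    tn = sum(not flags_allergen(q, allergens, **params) for q in negatives)
    return tp / len(positives), tn / len(negatives)


# Toy data only; a short window is used because the sequences are short.
known = ["MKTLLVAGAVLLSA"]
sens, spec = sensitivity_specificity(
    positives=["GGGLLVAGAGGG"], negatives=["PPPPPPPPPPPP"],
    allergens=known, window=8,
)
```

Calling `sensitivity_specificity` repeatedly with different `window`, `min_identity`, and `k` values gives the parameter sweep described above.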

Potential conformational IgE epitopes of novel food products can be screened virtually by analyzing the topography and spatial relationships of the small number of experimentally identified IgE epitopes of all known allergens, as well as the new structure identified experimentally. For most of the known allergens, we can use an experimental 3D structure or generate a 3D model of sufficient quality to characterize these epitopes.

A number of allergen sequence databases have been developed for different purposes in the last several years. Each used different criteria to determine which sequences to include, how to maintain and upgrade the data, what additional data to include, and what terminology to use to describe the relationships between the proteins. We propose the development of an allergen ontology expressed with Extensible Markup Language (XML) tags. Each allergen database will be able to adopt this ontology and its set of XML tags without altering the underlying data. This approach would promote cross-functionality between these resources.
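To make the idea of shared XML tags concrete, the following sketch wraps one allergen entry in a small tagged record and reads it back. The element and attribute names (`allergen`, `source`, `proteinFamily`, `crossReactiveWith`) are hypothetical and chosen only for illustration; the proposal does not specify a tag set. The record uses the well-known birch pollen allergen Bet v 1 and its apple homolog Mal d 1 as example data.

```python
import xml.etree.ElementTree as ET


def allergen_record(name, source, family, cross_reactive_with):
    """Build an XML element describing one allergen (hypothetical tag set)."""
    rec = ET.Element("allergen", {"name": name})
    ET.SubElement(rec, "source").text = source
    ET.SubElement(rec, "proteinFamily").text = family
    for other in cross_reactive_with:
        # Relationships are expressed as references to other allergen names.
        ET.SubElement(rec, "crossReactiveWith", {"ref": other})
    return rec


rec = allergen_record("Bet v 1", "Betula verrucosa", "PR-10", ["Mal d 1"])
xml_text = ET.tostring(rec, encoding="unicode")

# Any database that exports the same tags can be read the same way.
parsed = ET.fromstring(xml_text)
family = parsed.find("proteinFamily").text
partners = [e.get("ref") for e in parsed.findall("crossReactiveWith")]
```

Because the tags wrap the data rather than replace it, a database could in principle export such records from its existing tables without changing how they are stored.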

Expected Results:

All software tools for estimating the risk of allergenicity of recombinant food products will be available to the scientific community on our SDAP Web server. Another result of this project will be a systematic statistical analysis of the current FAO/WHO and EFSA bioinformatics tools for screening novel recombinant food products for potential allergenicity. Novel insights into the sequence and 3D characteristics of currently known allergens, and into quantitative descriptors for allergenicity, could provide an improved scientific basis for new bio-safety regulations.

Publications and Presentations:

Publications have been submitted on this project: 46 publications in total.

Journal Articles:

Journal Articles have been submitted on this project: 13 journal articles in total.

Supplemental Keywords:

human health, risk assessment, genetically engineered foods, allergic response, plant allergens, cedar trees, protein families, Bayesian approach, phage display library, Health, Scientific Discipline, Health Risk Assessment, Risk Assessments, Allergens/Asthma, Biochemistry, Biology, food allergenicity, genetically engineered food, dietary proteins, human exposure, oral allergy syndrome, bioinformatics, data base development

Progress and Final Reports:

  • 2007 Progress Report
  • 2008 Progress Report
  • Final Report