Grantee Research Project Results
2006 Progress Report: Development of a Virulence Factor Biochip and its Validation for Microbial Risk Assessment in Drinking Water
EPA Grant Number: R831628
Title: Development of a Virulence Factor Biochip and its Validation for Microbial Risk Assessment in Drinking Water
Investigators: Rose, Joan B. , Whittam, Thomas S. , Gulari, Erdogan , Hashsham, Syed
Institution: Michigan State University , University of Michigan
EPA Project Officer: Packard, Benjamin H
Project Period: November 1, 2004 through October 31, 2007 (Extended to April 30, 2009)
Project Period Covered by this Report: November 1, 2005 through October 31, 2006
Project Amount: $600,000
RFA: Microbial Risk in Drinking Water (2003)
Research Category: Drinking Water , Water
Objective:
To develop a DNA biochip capable of screening and identifying key waterborne microorganisms from water samples as well as virulence factors that have been associated with health risks and validate the use of this biochip. The biochip will include bacteria and viruses from the Contaminant Candidate List, indicator bacteria, and key virulence factors identified as part of the Virulence-Factor Activity Relationships (VFAR) pilot project. This project will focus on interpretation of the results using a risk-based analysis for drinking water.
Progress Summary:
The concept of using genetic databases to identify microbial risks in water, termed Virulence-Factor Activity Relationships (VFARs), was first developed by the Committee on Drinking Water Contaminants, National Research Council, as an approach to screen microorganisms for their occurrence in water and/or their ability to cause harm and waterborne disease. We are currently developing a bioinformatics pilot program for the assessment of VFARs and have developed a biochip known as GeneScreen for the detection of E. coli bacteria. The biochip exploits the sequence variability inherent in 16S and 23S rRNAs, the spacer region, and virulence and functional genes, which can provide identification (taxonomy) to the genus and species level as well as information on pathogenicity and potential risk.
In our original proposal, we proposed the development and validation of a high-density biochip for some 15 groups or genera/species of targeted bacteria and viruses, some of which are on the CCL and others of which are important for assessing the microbial safety of water. We targeted traditional indicator organisms, pathogenic bacterial strains and viruses, as well as key virulence factors. Our specific goals were to: 1. Select gene targets to encompass the microorganisms of interest to water safety; 2. Design probes to uniquely identify each of the above microorganisms and provide reliable detection; 3. Synthesize microfluidic biochips containing the above set of probes in replicate with positive and negative controls (Biochip synthesis); 4. Validate and field-test the synthesized biochips using standard individual targets, appropriate target mixtures, and field samples (Validation and field-testing); and 5. Undertake a pilot risk analysis of a water system(s), testing a variety of computational approaches for interpreting the results of the biochip (Analysis). In this report we summarize the work done and results achieved thus far in three areas: a. the selection, design, synthesis, and validation of a bacterial indicator chip; b. the selection, design, synthesis, and validation of a viral microarray; and c. the selection and design of an enterococcus indicator biochip.
1. INDICATOR CHIP
We have designed an in situ synthesized indicator biochip containing 8,000 probes (45-mers). The probes target gene sequences obtained from the organisms shown in Table 1. An earlier version of the biochip contained only 16S rRNA genes from these organisms. However, evaluation of that chip with environmental samples containing fecal matter did not yield a human-specific marker. This was somewhat expected considering the poor resolution of 16S as a marker of host. Full-length 16S rRNA gene amplicons from a horse fecal sample were hybridized individually and as a mixture with amplicons from human fecal samples. Signal intensities of the 2 complex backgrounds correlated more strongly when hybridized together (R² = 0.87) than when hybridized separately (R² = 0.58). This observation suggests that the presence of a background sample significantly influences the results of the diagnostic, reducing the ability to discriminate between fecal sources. Primers targeting various conserved regions throughout the 16S rRNA gene were used to amplify fragments of 2 different lengths. A 377 bp amplicon and a full-length 16S rRNA amplicon were examined by hybridizing each separately and in a complex background. Results suggest that the signal intensity of both amplicons was influenced by the complex background. Considering that a conserved region of the 16S rRNA gene is used to amplify both targets, significant interaction on dangling ends may still be occurring with the shorter target. Thus, targets must be amplified using specific regions of 16S rRNA genes, which eliminates the ability to use universal primers for target amplification.
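The R² values quoted above compare probe signal intensities between hybridizations. The following is a minimal sketch of such a comparison, assuming R² is the squared Pearson correlation of paired probe intensities (the report does not state the exact formula). The intensity values are invented for illustration only.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length signal vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length vectors with at least 2 points")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("constant vector: correlation undefined")
    return (sxy * sxy) / (sxx * syy)

# Hypothetical intensities for the same probes in two hybridizations.
sample_alone = [120.0, 340.0, 560.0, 90.0, 800.0]
sample_in_mixture = [150.0, 300.0, 610.0, 130.0, 770.0]
print(round(r_squared(sample_alone, sample_in_mixture), 3))
```

A value near 1 indicates that the two hybridizations produce nearly proportional probe signals, which is the situation in which fecal sources become hard to tell apart.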
Due to complications observed with 16S rRNA genes and bias of universal amplification, subsequent diagnostics are being performed with functional and marker genes. The success of selecting primer/probe signatures for potential indicators of human fecal contamination depends upon many factors but the following 3 are most important: a) targeting multiple marker genes simultaneously, b) selecting signatures related to fecal sources for each marker gene, and c) probe/primer design.
a) For the first factor, potential indicator genes were manually selected from NCBI, with functions related to virulence, enterotoxins, antibiotic resistance, bacteriocins, and cellulose degradation. Over 900 Protein IDs were collected and used as “seeds” with the Functional Gene Pipeline Repository (FGPR), operated by the Center for Microbial Ecology at Michigan State University. FGPR was used to automatically collect and update (monthly) all available homologues in NCBI for each gene family.
b) The ability to obtain signatures related to a specific fecal source for a given marker gene was also examined. For example, the shiga toxin 2 gene (stx2) from E. coli has been isolated from a number of organisms. A nucleotide tree comparing the 957 bp stx2 gene from 4 sources (cattle, sheep, deer, and pig) shows that the relationship between alleles can discriminate between sources for cattle and pig (Figure 1). However, only one allele from sheep differs from the 2 alleles from deer. Therefore, differences between alleles cannot always be used to find signatures specific to various fecal sources. This stresses the importance of knowing the source of alleles when designing specific signatures. The source of 638 alleles for 22 functional genes has been examined and will be used for primer/probe design.
Figure 1. A nucleotide tree of the 957 bp shiga toxin 2 gene (stx2) from E. coli. Alleles from various sources including deer, sheep, pig, and cattle are included.
c) Signature sequences for primers will be selected using the following protocol: generate consensus sequences for each gene and/or group of alleles from a specific source; use Primer Express (Applied Biosystems) to design primers from the consensus sequences (Tm of 59°C); ensure specificity of primers by BLASTing against the GenBank database (based on the extent of 3’ end perfect matches to other bacterial sequences); and filter for primers with low GC content on the 3’ end. For accompanying studies, primers for 110 virulence genes have been designed, validated, and evaluated in terms of specificity in complex environmental backgrounds. Rules were established that will be used for subsequent design of primers/probes for fecal indicator markers.
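The final step of the protocol, filtering for low GC content at the 3’ end, can be sketched as follows. The window of 5 bases and the limit of 2 G/C bases are assumptions chosen for illustration (the report gives no numeric cutoff), and the primer sequences are made up.

```python
def three_prime_gc(primer, window=5):
    """Count G and C bases in the last `window` bases of a primer written 5'->3'."""
    tail = primer.upper()[-window:]
    return sum(1 for base in tail if base in "GC")

def keep_low_3prime_gc(primers, window=5, max_gc=2):
    """Keep primers whose 3' tail contains at most `max_gc` G/C bases."""
    return [p for p in primers if three_prime_gc(p, window) <= max_gc]

# Hypothetical candidate primers (not taken from the report).
candidates = ["ATGCTAGCTAGGATCCGCGG", "GGCATCGATCGGATTATCAT"]
print(keep_low_3prime_gc(candidates))
```

Here the first candidate ends in five G/C bases and is discarded, while the second has a single G/C base in its 3’ tail and is kept. Tm calculation and BLAST-based specificity checks are left to the tools named in the protocol.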
Our next step is to test the validity of the above chip using real-world samples. By following an iterative procedure involving subtractive hybridization with respect to samples, we expect that the probes serving as human-specific markers can be identified. After hybridization and validation with environmental samples, we expect to obtain a few markers that are unique to human fecal matter and are always absent in other types of samples. An example of the hybridization signal with subtractive hybridization is shown in Figure 2; the actual hybridization signals will differ from what is shown there. The validation step will also be the most time-consuming step in the overall scheme because it will require collecting, processing, and hybridizing many samples, followed by data analysis to extract unique signals. From the experience gained with the 16S rRNA gene indicator chip, statistical tools have also emerged that can predict the reliability of a detected signal based on replication and signal intensity.
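The screening logic described above (markers present in human fecal samples and always absent elsewhere) can be sketched with set operations. This is only an illustration: the requirement that a marker be positive in every human sample, the source labels, and the probe identifiers are all assumptions, not details from the report.

```python
def human_specific_probes(samples):
    """Return probes positive in every human sample and in no other sample.

    samples: list of (source_label, iterable_of_positive_probe_ids).
    """
    human = [set(probes) for source, probes in samples if source == "human"]
    other = [set(probes) for source, probes in samples if source != "human"]
    if not human:
        return set()
    always_in_human = set.intersection(*human)
    seen_elsewhere = set().union(*other)
    return always_in_human - seen_elsewhere

# Hypothetical positive-probe calls per sample.
samples = [
    ("human", {"p1", "p2", "p3"}),
    ("human", {"p1", "p3"}),
    ("cattle", {"p3", "p4"}),
]
print(human_specific_probes(samples))
```

Each new non-human sample can only shrink the candidate set, which matches the iterative character of the procedure.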
Table 1. List of Potential Indicator Organisms
Genus | No. of functional and isr* genes | No. of species | No. of sequences | Species names
Bacteroides | 7 | 5 | 43 | distasonis, forsythus, fragilis, fragsin, vulgatus
Bifidobacterium | 2 | 13 | 41 | adolescentis, angulatum, animalis, bifidum, breve, cuniculi, dentium, infantis, lactis, longum, magnum, pseudolongum
Butyrivibrio | 1 | 1 | 12 | fibrisolvens
Clostridium | 9 | 3 | 36 | beijerinckii, perfringens, tyrobutyricum
Enterococcus | 10 | 20 | 70 | aerogenes, avium, casseliflavus, cecorum, durans, columbae, pseudoavium, faecalis, faecium, faeciumium, hirae, malodoratus, mundtii, pallens, raffinosus, ratti, saccharolyticus, solitarius, sulfureus, villorum
Escherichia | 8 | 1 | 63 | coli
Eubacterium | 1 | 1 | 1 | ramulus
Fusobacterium | 1 | 10 | 29 | canifelinum, mortiferum, naviforme, necrogenes, nucleatum, periodonticum, russii, simiae, ulcerans, varium
Lactobacillus | 5 | 22 | 77 | acidophilus, amylovorus, casei, crispatus, curvatus, delbrueckii, fermentum, frumenti, gasseri, hamsteri, helveticus, johnsonii, paracasei, paraplantarum, pentosus, plantarum, reuteri, rhamnosus, ruminis, sakei, suntoryeus, zeae
Lactococcus | 1 | 1 | 4 | lactis
Ruminococcus | 6 | 4 | 15 | albus, flavefaciens, gnavus, hansenii
Salmonella | 1 | 2 | 2 | enterica, typhimurium
Shigella | 4 | 3 | 10 | boydii, flexneri, sonnei
Streptococcus | 8 | 23 | 84 | agalactiae, alactolyticus, anginosus, bovis, canis, constellatus, dysgalactiae, equinus, gordonii, intermedius, lutetiensis, macedonicus, mitis, mutans, oralis, parasanguinis, parasanguis, pneumoniae, porcinus, pyogenes, salivarius, suis, thermophilus
Total: 14 genera | 64 | 109 | 487 |
* isr: 16S-23S intergenic spacer region
Table 2. List of potential indicator marker genes
Gene name (No. of sequences)*, by genus
Bacteroides | Bifidobacterium | Clostridium | Enterococcus | Lactobacillus | Ruminococcus | Streptococcus
bft (7) | isr (19) | cirA (2) | VanD (7) | acdA (1) | albB (1) | cylE (4)
bspA (1) | recA (22) | cloA (1) | ace (6) | acdT (2) | celA (2) | gtfB (7)
cfiA (17) | | cpa (6) | as-48 (3) | curA (4) | cesA (1) | isr (41)
cfxA (2) | | cpb (2) | enlA (1) | isr (63) | endB (1) | mutB (3)
nanH (3) | | cpb2 (7) | entP (4) | recA (7) | recA (1) | ply (4)
pm (1) | | cpe (6) | esp (4) | | rumA (9) | recA (16)
recA (12) | | etx (5) | gelE (2) | | | sda (3)
 | | isr (5) | hyl (1) | | | var (6)
 | | recA (2) | isr (26) | | |
 | | | recA (16) | | |
* Includes genes related to specific functions, marker genes, and the 16S-23S intergenic spacer region (isr)
Figure 2. A microfluidic biochip showing a few positive spots that are unique to one sample (circles) vs. many that are present in other types of samples (triangles).
2. VIRUS MICROARRAY
The virus microarray was designed to detect sequences from the group of viruses that are known or suspected to cause disease through drinking water. This group consists primarily of the enteroviruses but also includes several other viral families, for example, Hepatitis A virus, Hepatitis E virus, Sapovirus, Norwalk virus, and Rotavirus. Table 3 shows the 27 major families of viral pathogens chosen for the chip, the reference genetic sequences used to perform the analysis (listed by accession number), and the sequence length. Wherever possible, complete viral genomes were chosen for probe design. In a few instances such information was lacking, and whatever sequence information was available was used instead: the three human rotavirus groups were represented by their viral proteins 4 and 7 (VP4 and VP7, respectively), toroviruses by the hemagglutinin-esterase gene and nucleocapsid protein mRNA, and picobirnaviruses by the RNA-dependent RNA polymerase sequence and the first segment of an as-yet-unknown gene.
This approach of using representative genome sequences for probe design was adopted to give as broad a chance as possible of identifying a target virus from among the 27 target groups while still differentiating between these groups as specifically as possible.
Table 3. Virus classes, sequence description and GenBank accession no.
Virus | Sequence description | Type of genome | Accession no. | Sequence length (bp)
Hepatitis A virus | complete genome | ssRNA positive no DNA stage | NC_001489 | 7478
Hepatitis E virus | complete genome | ssRNA positive no DNA stage | NC_001434 | 7176
Human adenovirus A | complete genome | dsDNA | NC_001460 | 34125
Human adenovirus B | complete genome | dsDNA | NC_004001 | 34794
Human adenovirus C | complete genome | dsDNA | NC_001405 | 35937
Human adenovirus D | complete genome | dsDNA | NC_002067 | 35100
Human adenovirus E | complete genome | dsDNA | NC_003266 | 35994
Human adenovirus F (Adenovirus type 40) | complete genome | dsDNA | NC_001454 | 34214
Human adenovirus F (Adenovirus type 41) | complete genome | dsDNA | DQ315364 | 34189
Norwalk virus | complete genome | ssRNA positive no DNA stage | NC_001959 | 7654
Sapovirus | complete genome | ssRNA positive no DNA stage | NC_010624 | 7458
Human enterovirus A | complete genome | ssRNA positive no DNA stage | NC_001612 | 7413
Human enterovirus B | complete genome | ssRNA positive no DNA stage | NC_001472 | 7389
Human enterovirus C | complete genome | ssRNA positive no DNA stage | NC_001428 | 7401
Human enterovirus D | complete genome | ssRNA positive no DNA stage | NC_001430 | 7390
Human enterovirus E | complete genome | ssRNA positive no DNA stage | NC_003988 | 7374
poliovirus | complete genome | ssRNA positive no DNA stage | NC_002058 | 7440
rotavirus A | VP4 | dsRNA virus | AB077766 | 2359
rotavirus A | VP7 | dsRNA virus | AB071404 | 1062
rotavirus B | VP4 | dsRNA virus | AY539857 | 2306
rotavirus B | VP7 | dsRNA virus | AY539856 | 814
rotavirus C | VP4 | dsRNA virus | AB008670 | 2283
rotavirus C | VP7 | dsRNA virus | AB008671 | 1063
coronavirus | complete genome | ssRNA positive no DNA stage | NC_002645 | 27317
cytomegalovirus (HH5) | complete genome | dsDNA virus | NC_006273 | 235645
torovirus | hemagglutinin-esterase | ssRNA positive no DNA stage | AF159585 | 1251
torovirus | Human torovirus nucleocapsid protein mRNA | ssRNA positive no DNA stage | AF024539 | 219
picobirnavirus | RNA dependent RNA pol | dsRNA virus | AF246940 | 1674
picobirnavirus | segment 1 unknown gene | dsRNA virus | AF246941 | 1572
Human astrovirus | complete genome | ssRNA positive no DNA stage | NC_001943 | 6813
BK polyomavirus | complete genome | dsDNA viruses, no RNA stage | NC_001538 | 5153
JC polyomavirus | complete genome | dsDNA viruses, no RNA stage | NC_001699 | 5130
The picornaviridae family
The picornaviridae comprise the following viral genera of interest for this study: the group A to E enteroviruses, poliovirus, and the Hepatitis A virus. The picornaviridae genome consists of a single positive-sense, single-stranded RNA molecule of between 7 and 8.5 kb. Upon entry into the target cell, the genome is immediately translated to produce the proteins and enzymes necessary for the replication and packaging of daughter viral particles. In other words, unlike other pathogens in which an intact and viable cell, spore, or fomite is needed to cause an infection, the genome itself is capable of causing and sustaining an infection, though infectivity is greatly increased when it is packaged in a protein capsid coat.
Enteroviruses and hepatitis A viruses are frequently isolated from sewage-impacted freshwaters. Pusch et al. (2005) detected enteroviruses in 29-76% and hepatitis A viruses in 5-20% of the samples taken from sewage treatment plant influent and river sites downstream of sewage treatment plants.
Figure 3. A typical picornaviridae genome
Hepeviridae and Caliciviridae Families
There are a number of other positive-sense, single-stranded RNA viruses that resemble the picornaviridae in their replication strategy and are also implicated in waterborne viral gastroenteritis. This list includes the Hepatitis E virus from the Hepeviridae family, coronaviruses and toroviruses from the order Nidovirales, and noroviruses and sapoviruses from the Caliciviridae family. Genetically, their primary differences from the picornaviridae are the use of a host-derived 5’ cap instead of the genome-linked viral protein (VPg) and, in some, the absence of a poly-A tail.
Though there have been no endemic US outbreaks of hepatitis E, major waterborne outbreaks have occurred in Asia and North and East Africa. Large waterborne outbreaks, in addition to sporadic cases, have occurred in developing countries (Corwin et al., 1996, Divizia et al., 2004, Hui et al., 2001, Jothikumar et al., 2000, Pina et al., 1998, Singh et al., 1998, Souto et al., 1997 and Vaidya et al., 2003). These waterborne outbreaks have typically been associated with deficient drinking water systems or poor sanitary practices. The primary incidence of hepatitis E infection in the United States is due to returning travelers. From 1989 to 1992, six cases of acute hepatitis E infection were detected in the United States among persons who had returned from international travel (CDC, 1993).
Noroviruses, previously designated Norwalk, Norwalk-like, and small round structured viruses (SRSVs), are a major group of emerging waterborne viral pathogens. CDC estimates that 23 million cases of acute gastroenteritis are due to norovirus infection, and it is now thought that at least 50% of all foodborne outbreaks of gastroenteritis can be attributed to noroviruses. Among the 232 outbreaks of norovirus illness reported to CDC from July 1997 to June 2000, 57% were foodborne, 16% were due to person-to-person spread, and 3% were waterborne (CDC, 2001). Most foodborne outbreaks of norovirus illness are likely to arise through direct contamination of food by a food handler immediately before its consumption. Outbreaks have frequently been associated with consumption of cold foods, including various salads, sandwiches, and bakery products. Liquid items (e.g., salad dressing or cake icing) that allow virus to mix evenly are often implicated as a cause of outbreaks. Food can also be contaminated at its source, and oysters from contaminated waters have been associated with widespread outbreaks of gastroenteritis. Other foods, including raspberries and salads, have been contaminated before widespread distribution and subsequently caused extensive outbreaks. Waterborne outbreaks of norovirus disease in community settings have often been caused by sewage contamination of wells and recreational water.
Reoviridae family
Unique among the viruses that cause viral gastroenteritis are the double-stranded RNA viruses from the Reoviridae family, the human rotaviruses of groups A to C. Rotavirus infection is thought to be the predominant cause of infant diarrheal disease, especially in developing countries. It is estimated that as many as 1 billion cases of diarrhea occur annually worldwide among children 5 years old and below, with rotavirus being the predominant etiological agent (Parashar et al., 2003a). In addition, rotavirus infections cause as many as 100 deaths annually in the US, and over 2.1 million deaths occur annually among children 5 years and below worldwide (Parashar et al., 2003b).
The rotavirus genome consists of 11 segments of double-stranded RNA. The encapsidated virus particle is 70 nm in diameter and has a distinctive “wheel” shape when viewed under the electron microscope.
RNA viruses
An as-yet-unclassified double-stranded RNA virus suspected of causing waterborne disease is the picobirnavirus (Cascio et al., 1996). Its name is derived from pico (Latin: small), bi (Latin: two, for its bisegmented nature), and RNA virus. Little is currently known of this virus and its ability to cause disease in humans, primarily due to its infrequent occurrence and detection. Its chief importance is its detection among immunocompromised patients suffering from diarrhea (Gonzalez et al., 1998).
Human cytomegalovirus and adenoviruses
The last two families of viruses whose sequences were used to design the chip are the Human Herpesvirus 5 (human cytomegalovirus) and the adenoviruses. These viruses are characterized by their large genomes of approximately 35 kbp for the adenoviruses and 240 kbp for the human cytomegalovirus (CMV). The cytomegalovirus has received prominence in recent years due to its isolation from immunocompromised individuals suffering from diarrhea (Petric, 1999).
Human astroviruses
Astroviruses are small (28-30 nm), non-enveloped, positive-sense, single-stranded RNA viruses. They are transmitted via the fecal-oral route and have been reported to cause outbreaks of gastroenteritis, especially among children, though not as frequently as rotavirus or adenovirus (Gabbay et al., 2006; Sirinavin et al., 2006). Maunula et al. (2004) reported an outbreak of gastroenteritis in Helsinki that was traced to the presence of noroviruses and astroviruses at a wading pool, demonstrating the possibility of acquiring an astrovirus infection through recreational contact with polluted water.
Human polyomaviruses
Polyomavirus is the sole genus of viruses within the family Polyomaviridae. Polyomaviruses are DNA-based (double-stranded DNA, ~5000 base pairs, circular genome), small (40-50 nanometers in diameter), and icosahedral in shape, and do not have a lipoprotein envelope. They are potentially oncogenic (tumor-causing); they often persist as latent infections in a host without causing disease, but may produce tumors in a host of a different species, or a host with an ineffective immune system. The name polyoma refers to the viruses' ability to produce multiple (poly-) tumors (-oma).
There are two polyomaviruses found in humans: JC virus, which can infect the respiratory system, kidneys, or brain (sometimes causing the fatal progressive multifocal leukoencephalopathy in the latter case), and BK virus, which produces a mild respiratory infection and can affect the kidneys of immunosuppressed transplant patients. Both viruses are very widespread: approximately 80 percent of the adult population in the United States have antibodies to BK and JC.
There have been no documented outbreaks of waterborne polyomavirus disease thus far. However, polyomaviruses have been detected in human sewage and urine (Bofill-Mas and Girones, 2001), and their potential as waterborne pathogens has been speculated upon (Bofill-Mas et al., 2001; 2003).
Detection of viral pathogens in water
In order to detect the low numbers of viruses commonly present in surface and groundwaters, large volumes of water have to be concentrated to a sample volume small enough for processing in the lab. Commonly this involves passing surface and groundwaters through an electrostatic filter, which adsorbs the viruses onto its charged surfaces. The filters are then eluted in the laboratory by altering the surface protein charges of the viruses to force them to desorb from the electrostatic filters. Viruses are then concentrated by a combination of high-speed centrifugation and the addition of flocculating agents. This method has been successfully used to concentrate volumes in excess of 200 liters of surface water and 1000 liters of groundwater to less than 30 milliliters of virus concentrate. Raw sewage, which contains much higher levels of viruses, does not require such large sample volumes to be collected. Between 5 and 6 liters of raw sewage can be inexpensively concentrated by high-speed centrifugation and the addition of flocculating agents.
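The volume reductions quoted above correspond to concentration factors of several thousand-fold. A short worked calculation, assuming the factor is defined simply as initial volume divided by final volume (virus recovery losses are not modeled here):

```python
def concentration_factor(initial_liters, final_milliliters):
    """Volume reduction factor: initial volume divided by final volume."""
    return initial_liters * 1000.0 / final_milliliters

# Volumes taken from the text; 30 mL is the stated upper bound on the
# concentrate volume, so these factors are lower bounds.
print(round(concentration_factor(200, 30)))   # surface water
print(round(concentration_factor(1000, 30)))  # groundwater
```

Because the final concentrate is stated as less than 30 milliliters, the actual volume reduction would be at least about 6,700-fold for surface water and about 33,000-fold for groundwater.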
The conventional method of detecting viral pathogens in the virus concentrate has been the use of cultured animal cells. A number of animal cell lines are routinely employed to culture enteric viruses. Viruses are then detected by their production of cytopathic effects (CPE) in the infected cell culture. This method, however, is time-consuming and cannot detect viruses that are non-culturable. Some of the important viral pathogens that cannot be detected via cell culture include the noroviruses, toroviruses, and coronaviruses. Molecular methods, notably reverse transcription polymerase chain reaction (RT-PCR) methods, have been successfully employed to overcome this limitation. However, the sensitivity of detection was generally found to be low, and most of the common concentration methods for viruses co-concentrated inhibitors that interfered with the PCR.
More recently, PCR methods have been integrated with cell culture to overcome these problems. Integrated Cell Culture Polymerase Chain Reaction (ICC-PCR) has allowed the detection of a wide range of enteric viruses without the long culture times traditionally associated with cell culture methods. However, a limitation of PCR and other similar amplification methods is that they can only detect a limited number of targets within a single tube. Thus, routine monitoring for the most common viruses that cause diarrhea is not feasible. In addition, even with degenerate primers targeted to amplify a wide selection of viruses, inherent PCR bias might skew the results, and there is a limit to how many targets may be detected within a multiplex setting.
A microarray approach on the other hand can potentially be used to detect the gene sequences of a large number of viral targets in a single reaction. The use of multiple probes for a single viral target has the two-fold benefit of increasing specificity while reducing the likelihood that a mutation in the viral genome will result in false negatives. Unlike PCR, in which the specificity of detection is the result of the selective binding of primers to nucleic acid sequences followed by subsequent amplification, our approach in the viral microarray is to use random six-base nucleotide primers (random hexamers) to non-specifically label the sample with amino allyl dUTP which can then be coupled to fluorescent dye molecules. These fluorescently labeled strands of nucleic acid can then be specifically detected using probes bound to the silica-based microarray.
Probe design
Probes were designed using the OligoArray version 2.1 software, available from http://berry.engin.umich.edu/oligoarray2_1/.
The probes were designed to conform to the following specifications:
- Max Tm: 75 °C (except for torovirus: 80 °C)
- Min Tm: 70 °C (except for torovirus: 65 °C)
- Max GC: 60%
- Min GC: 40%
- Max temp for secondary structure: 45 °C
- Max temperature for cross hybridization: 45 °C
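The Tm and GC limits listed above can be expressed as a simple acceptance check. This is a minimal sketch, not OligoArray's own logic: the Tm is supplied by the caller rather than computed, the secondary-structure and cross-hybridization limits are not modeled, and the function names and example sequence are hypothetical.

```python
def gc_percent(seq):
    """Percentage of G and C bases in a probe sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(1 for base in seq if base in "GC") / len(seq)

def meets_spec(seq, tm_c, tm_range=(70.0, 75.0), gc_range=(40.0, 60.0)):
    """True if a probe's Tm (deg C) and GC content fall within the limits."""
    gc = gc_percent(seq)
    return tm_range[0] <= tm_c <= tm_range[1] and gc_range[0] <= gc <= gc_range[1]

# Default limits, then the relaxed torovirus Tm window (65-80 deg C).
print(meets_spec("ATGCATGCAT", 72.0))
print(meets_spec("ATGCATGCAT", 78.0, tm_range=(65.0, 80.0)))
```

Passing the Tm window as a parameter keeps the torovirus exception explicit rather than hard-coding a second rule.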
Probes were designed from the positive strand of the genetic sequence. The local BLAST database against which the probes were compared comprised all the probe sequences in both positive- and negative-sense strands, together with the sequences that showed a large degree of homology to non-specific sequences as determined by a MEGABLAST search using the following criteria: database, nr; E value, 10; word size, 11. This allowed the OligoArray software to filter out non-specific gene sequences from the probes designed.
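Including both sense strands in a comparison database amounts to adding the reverse complement of each sequence. A minimal sketch of that step, assuming plain DNA strings with unambiguous bases (the helper names and the example sequence are illustrative, not part of the OligoArray workflow):

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]

def both_strands(sequences):
    """Return each sequence followed by its reverse complement."""
    out = []
    for seq in sequences:
        out.append(seq)
        out.append(reverse_complement(seq))
    return out

print(both_strands(["ATGCCGTA"]))
```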
A total of 780 specific probes were designed targeting the 27 viral targets (approximately 30 probes per viral family target). Generating multiple probes for each target family would enhance the reliability of detection.
Microarray construction
An initial batch of three microarray chips was synthesized by the University of Michigan Engineering Machine Shop to specifications determined by Dr. Gulari. The microarray chip format used for the virus chip was a 68 by 119 array, giving a maximum of 8092 wells. 4050 wells were randomly populated with 5 copies of each of the 780 designed probes, representing 50% of the chip capacity. Multiple copies of probes were used to provide technical replication of the signals.
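Random placement of replicate probes on the array can be sketched as below. The grid size, probe count, and copy number come from the text; the seeded random generator and the function name are assumptions. Note that 5 copies of 780 probes account for 3900 wells, so this sketch places only those; the text does not describe the contents of the remaining populated wells.

```python
import random

ROWS, COLS = 68, 119        # chip format given in the report (8092 wells)
N_PROBES, COPIES = 780, 5   # designed probes and replicates per probe

def random_layout(n_probes=N_PROBES, copies=COPIES, rows=ROWS, cols=COLS, seed=0):
    """Map (row, col) -> probe index, placing each probe `copies` times at random."""
    wells = [(r, c) for r in range(rows) for c in range(cols)]
    needed = n_probes * copies
    if needed > len(wells):
        raise ValueError("not enough wells for the requested replicates")
    rng = random.Random(seed)            # seeded so the layout is reproducible
    chosen = rng.sample(wells, needed)   # distinct wells, random order
    return {well: i // copies for i, well in enumerate(chosen)}

layout = random_layout()
print(len(layout), ROWS * COLS)
```

Scattering replicates across the chip, rather than placing them side by side, helps separate probe-specific signal from position-dependent artifacts.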
Probes were synthesized in-situ in an automated process similar to making oligonucleotides on a DNA synthesizer. The major difference in the process is the use of a photo-generated acid (PGA) rather than an acid in the DMT deprotection step to control the parallel synthesis. This deprotection is initiated by directing light at selected three-dimensional nano-chambers in microfluidic chips. In a synthesis cycle, upon light activation, acid forms in seconds, removing the DMT group. An incoming phosphoramidite nucleoside monomer is then coupled to the growing oligonucleotide chain. The synthesis cycle is repeated for each additional monomer until an array of thousands of oligonucleotides in a microfluidic chip is formed.
Arrays of oligonucleotides are made by in situ coupling of DMT-protected nucleotide monomers at selected reaction sites according to the sequences of the oligonucleotides at each synthesis cycle. The process uses computer-generated light patterns to control a projection device (similar to a seminar presentation using a PowerPoint file), which in turn projects a light pattern onto the chip at each reaction cycle to create a specific chip reaction pattern. The localized light energy generates the PGA, allowing selective deprotection; only these deprotected sites couple with the incoming monomer.
These synthesis cycles are repeated to produce the desired oligonucleotide arrays. This digital photolithography process avoids the expensive and time-consuming photomasks used in conventional photolithographic processes and, more importantly, it enables flexibility and enhances efficiency for oligonucleotide array synthesis (Figure 4).
Figure 4. Light beams from a UV-Vis lamp are directed by a microprocessor onto selected portions of the chip. Incidence of a light beam on a section of the chip causes the formation of acid, which deprotects the site and allows probe elongation. Incremental addition of probe bases results in the synthesis of complete probes of the desired length and base sequence.
Sample Processing
Viruses were recovered from cell culture by scraping the cells off the flask surface using a cell scraper. The viruses were concentrated using an Amicon Ultra 100k™ ultrafiltration column (Millipore Inc., Billerica, MA) following the manufacturer’s instructions. Viral nucleic acids were extracted using the QIAamp Viral Nucleic Extraction kit from Qiagen, which extracts both viral RNA and viral DNA.
Viral nucleic acid extracts were then divided in half to be processed for RNA and for DNA viruses. RNA was labeled using a modified BioPrime labeling protocol and TIGR microarray protocol (http://pga.tigr.org/sop/M004_1a.pdf). Briefly, between 2 and 5 μg of template RNA is used to generate a first-strand cDNA molecule using reverse transcriptase and incubation at 45°C. Reverse transcriptase uses RNA as a template and synthesizes a complementary strand of DNA (complementary DNA). The synthesized cDNA strand was then coupled to NHS-ester Cy dye available from Amersham Biosciences.
DNA targets were generated in a manner similar to that used for RNA targets. However, the large fragment of the DNA polymerase I enzyme (Klenow fragment) was used instead of reverse transcriptase to generate the modified DNA daughter strands (Figure 5). Less template is required for DNA targets (between 0.5 and 1μg of template DNA) and incubation is carried out at 37°C as opposed to 45°C. The modified DNA daughter strand is then coupled to NHS-ester Cy dye. Different colored Cy dyes are used to differentially label RNA and DNA targets.
The labeled DNA and RNA targets were then hybridized together or separately to the microarray. They can then be detected using a microarray scanner where they will fluoresce at their own specific wavelengths.
Microarray hybridization
Microarray hybridization is performed using a Xeotron™ microfluidic hybridization station. Hybridization is initially carried out at 20°C for 16-18 hours to allow the target DNA to bind to the probes on the chip. The array is then washed at temperatures increasing in 1°C increments, with a flow rate of 500μl/min for 1.4 minutes, in the presence of hybridization wash buffer (10mM Na2HPO4, 5mM EDTA, pH 6.6). The microarray is scanned between wash cycles to generate a melting curve.
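As an illustration of how the scans taken between washes might be summarized, the sketch below estimates, for a single probe, the temperature at which the signal falls to half of its starting value. The half-signal criterion, the linear interpolation, and the function name are assumptions made for this example and are not the analysis method reported here.

```python
def half_signal_temperature(temps_c, intensities):
    """Estimate the temperature at which signal drops to half its starting value.

    temps_c: wash temperatures in ascending order (degrees C).
    intensities: background-corrected signal measured at each temperature.
    Returns None if the signal never falls to the half-signal level.
    """
    if len(temps_c) != len(intensities) or not temps_c:
        raise ValueError("temps_c and intensities must be non-empty and equal length")

    half = intensities[0] / 2.0
    for i in range(1, len(temps_c)):
        before, after = intensities[i - 1], intensities[i]
        if before > half >= after:
            # Linear interpolation between the two scans that bracket the half level.
            frac = (before - half) / (before - after)
            return temps_c[i - 1] + frac * (temps_c[i] - temps_c[i - 1])
    return None


if __name__ == "__main__":
    # Hypothetical melting curve for one probe.
    print(half_signal_temperature([25, 26, 27, 28], [1000, 900, 400, 100]))
```

A perfectly matched probe would be expected to retain signal to a higher temperature than a mismatched one, so comparing this value across probes is one simple way to use the melting curve.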
Microarray scanning and data collection
Microarray scanning is carried out using the GenePix 4000B Microarray Scanner (Molecular Devices Inc., Sunnyvale, CA). The scanned images were analyzed using GenePix Pro 5.0. Data were collected and normalized using filters built into the software and then graphed in Microsoft Excel.
Figure 5. Top. Experimental steps for the labeling of RNA targets to be hybridized to the array. Bottom. Experimental steps for the labeling of DNA targets to be hybridized to the array.
Initial results
Results were obtained for the hybridization of labeled poliovirus LSC-1 to the viral microarray. The 30 probes generated for poliovirus were found to be completely specific (30/30 hybridization). There was no significant cross-hybridization with non-poliovirus probes (0/750) throughout the melting curve temperature range (25°C to 60°C). Signal intensities for poliovirus-specific probes were several orders of magnitude greater than those for non-poliovirus probes. Two additional repeat hybridizations generated similar results.
Figure 5. Poliovirus LSC-1 hybridization results. There was complete specificity of probes for poliovirus, and no non-specific signals arising from non-poliovirus probes were observed.
Results were also obtained for the hybridization of labeled Adenovirus type 40 and type 41 to the viral microarray. Twenty-four of the sixty probes generated for Adenovirus type 40 and 41 were found to be specific for their respective target viruses. One probe whose target sequence is present in the genomes of both Adenovirus type 40 and 41 showed equal signal levels from both target viruses. There was no significant cross-hybridization with non-adenovirus probes (0/720) (Figure 6). Two additional repeat hybridizations generated similar results.
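Counts such as 30/30 and 0/750 can be tallied from scanner output along the lines of the sketch below. The fixed signal threshold, the probe identifiers, and the intensity values are hypothetical; the report does not state how a positive probe was called.

```python
def tally_hybridization(signals, target_probes, threshold):
    """Count on-target and off-target probes whose signal exceeds a threshold.

    signals: dict mapping probe id -> signal intensity.
    target_probes: set of probe ids designed for the hybridized virus.
    threshold: assumed cutoff above which a probe is called positive.
    Returns ((on_hits, on_total), (off_hits, off_total)).
    """
    on_hits = on_total = off_hits = off_total = 0
    for probe, value in signals.items():
        positive = value > threshold
        if probe in target_probes:
            on_total += 1
            on_hits += int(positive)
        else:
            off_total += 1
            off_hits += int(positive)
    return (on_hits, on_total), (off_hits, off_total)


if __name__ == "__main__":
    # Hypothetical array: 30 poliovirus probes and 750 probes for other viruses.
    polio = {f"polio_{i}" for i in range(30)}
    signals = {p: 50000.0 for p in polio}
    signals.update({f"other_{i}": 50.0 for i in range(750)})
    print(tally_hybridization(signals, polio, threshold=500.0))
```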
Figure 6. Adenovirus type 40 and 41 hybridization results. There was complete specificity of probes for Adenovirus and no non-specific signals arising from non-Adenovirus type 40/41 probes.
Between five and six liters of raw human sewage were collected from a local wastewater treatment plant and concentrated for viruses using beef extract and iron(III) chloride as a flocculating agent. Viral concentrates were incubated on animal cells until the development of cytopathic effect, indicative of a viral infection. Viral nucleic acid was extracted, and the DNA and RNA were processed for hybridization on the microarray. A snapshot of the viruses present within that sample was obtained (Figure 7). The viral groups detected within that sewage sample included Adenovirus A-F (including adenovirus type 40 and 41), human astroviruses, some human enteroviruses, one strain of polyomavirus, and human toroviruses.
Figure 7. Hybridization snapshot of the viruses present in a sample of raw sewage taken from a wastewater treatment plant. Solid bars illustrate viruses that were present in higher than background levels or viruses of particular interest in water and wastewater treatment and recreational use.
ENTEROCOCCUS INDICATOR MICROARRAY
The aim of this work is to develop a microarray that includes probes for Enterococcus sequences to characterize the occurrence of these bacteria in water. Enterococcus bacteria are members of the Group D Streptococci and are characterized by their ability to grow at low and elevated temperatures (10°C and 45°C), at elevated pH (9.5), and in 6.5% NaCl. This group includes 27 species, of which E. faecalis and E. faecium are the most prevalent in water. Sources of Enterococci include the feces of mammals and birds, and they have also been isolated from algae mats and plants.
A dataset has been developed that includes 147 Enterococcus DNA sequences for input to a microarray aimed at species and source identification of Enterococcus in water. These sequences include six genus-specific sequences and 141 sequences from 14 different Enterococcus species, the majority of sequences being from E. faecium (65 sequences) and E. faecalis (53 sequences). These sequences were drawn from the published literature and also from the NCBI database. The microarray dataset includes the source of the sequence and, where available, the link to the source and/or related publication. Sequences included are related to a variety of genes and functions, with a focus on bacterial identification, virulence and pathogenicity. Examples include sequences coding for antibiotic resistance, Enterococcal surface protein (Esp), and putative pathogenicity islands.
The Esp protein as a target
A putative human-derived marker that is associated with the presence of fecal contamination from human sources has been identified (Scott et al. 2005). The enterococcal surface protein, Esp, originally found in Enterococcus faecalis, has been associated with increased virulence in human infections. Enterococcus faecalis is a leading etiological agent of urinary tract infections, and Esp has been shown to contribute to colonization and persistence of E. faecalis in the urinary tract. Nucleotide sequence analysis of polymerase chain reaction (PCR)-amplified esp DNA from water samples taken in the aftermath of Hurricane Katrina showed 100% homology to human-specific esp gene sequences. The presence of this marker sequence in a water sample thus indicates the contribution of a human source of contamination. The next steps in the development of this microarray are to design probes from this sequence dataset and incorporate them into the indicator microarray discussed in previous sections.
Summary to Date
The bacterial indicator microarray requires further development and validation, as well as the incorporation of the Enterococcus gene probes. The virus microarray has been partially validated using type strains of poliovirus LSC-1 and Adenovirus type 40 and type 41, and will be further validated using other type strains from ATCC collections and donated positive patient specimens. Hybridization of sewage-derived viral nucleic acid with the microarray has begun and will be continued in order to develop snapshots of the circulating viruses within the community over time. Further validation of the virus microarray will also be carried out using environmental isolates. Thus far, we have first selected gene targets to encompass the microorganisms of interest to water safety. Second, the design of probes for the unique identification of each target microorganism has been undertaken for the virus microarray and will shortly be completed for the bacterial indicator microarray and enterococci. The virus microarray has been tested against poliovirus LSC-1 and Adenovirus type 40 and 41 and has been found to be very specific. A methodology for processing samples for analysis on the viral microarray has also been developed.
Future Activities:
The indicator biochip will be tested with real world samples and iteratively screened to derive specific probes for detection of human fecal pollution in water. The virus biochip will be tested with additional type virus strains and samples of human sewage passaged through cell culture.
References:
Bofill-Mas S, Girones R. 2001. Excretion and transmission of JCV in human populations. J Neurovirol. 7(4):345-9.
Bofill-Mas S, Formiga-Cruz M, Clemente-Casares P, Calafell F, Girones R. 2001. Potential transmission of human polyomaviruses through the gastrointestinal tract after exposure to virions or viral DNA. J Virol. 75(21):10290-9.
Bofill-Mas S, Clemente-Casares P, Major EO, Curfman B, Girones R. 2003. Analysis of the excreted JC virus strains and their potential oral transmission. J Neurovirol. 9(4):498-507.
Cascio A, Bosco M, Giammanco A, Ferraro D, Arista S. 1996. Identification of picobirnavirus from feces of Italian children suffering from acute diarrhea. Eur J Epidemiol. 12: 545-7.
CDC. 1993. Hepatitis E Among U.S. Travelers, 1989–1992. MMWR 42(1). pp 1-4.
CDC. 2001. “Norwalk-like viruses:” public health consequences and outbreak management. MMWR 2001;50(No. RR-9):1-10.
Corwin, A.L., H.B. Khiem, E.T. Clayson, K.S. Pham, T.T. Vo, T.Y. Vu, T.T. Cao, D. Vaughn, J. Merven, T.L. Richie, M.P. Putri, J. He, R. Graham, F.S. Wignall and K.C. Hyams. 1996. A waterborne outbreak of hepatitis E virus transmission in southwestern Vietnam, Am. J. Trop. Med. Hyg. 54 (6), pp. 559–562.
Divizia, M., R. Gabrieli, D. Donia, A. Macaluso, A. Bosch, S. Guix, G. Sanchez, C. Villena, R.M. Pinto, L. Palombi, E. Buonuomo, F. Cenko, L. Leno, D. Bebeci and S. Bino. 2004. Waterborne gastroenteritis outbreak in Albania, Water Sci. Technol. 50 (1), pp. 57–61.
Gabbay YB, Chamone CB, Nakamura LS, Oliveira DS, Abreu SF, Cavalcante-Pepino EL, Mascarenhas JD, Leite JP, Linhares AC. 2006. Characterization of an astrovirus genotype 2 strain causing an extensive outbreak of gastroenteritis among Maxakali Indians, Southeast Brazil. J Clin Virol. 37(4): 287-92.
Gonzalez GG, Pujol FH, Liprandi F, Deibis L, Ludert JE. 1998. Prevalence of enteric viruses in human immunodeficiency virus seropositive patients in Venezuela. J Med Virol. 55: 288-92.
Hui, Y., S.A. Sattar, K.D. Murrell, W. Nip and P.S. Stanfield. 2001. Editors, Hepatitis A and E viruses, vol. 2: Viruses, Parasites, Pathogens, and HACCP (2nd ed.), Marcel Dekker, Inc.
Jothikumar, N., R. Paulmurugan, P. Padmanabhan, R.B. Sundari, S. Kamatchiammal and K.S. Rao. 2000. Duplex RT-PCR for simultaneous detection of hepatitis A and hepatitis E virus isolated from drinking water samples, J. Environ. Monit. 2 (6), pp. 587–590.
Leavis et al., “Epidemic and Nonepidemic Multidrug-Resistant Enterococcus faecium”, Emerging Infectious Diseases, Vol. 9, No. 9, September 2003.
Maunula L, Kalso S, Von Bonsdorff CH, Ponka A. 2004. Wading pool water contaminated with both noroviruses and astroviruses as the source of a gastroenteritis outbreak. Epidemiol Infect. 132(4):737-43.
Parashar, U.D., Glass, R.I., 2003a. Viral causes of gastroenteritis. In: Desselberger, U., Gray, J. (Eds.), Viral Gastroenteritis. Elsevier Science, Amsterdam, pp. 9–22.
Parashar UD, Hummelman EG, Bresee JS, Miller MA, Glass RI. 2003b. Global illness and deaths caused by rotavirus disease in children. Emerg Infect Dis. 9(5):565-72.
Petric M. Caliciviruses, astroviruses and other diarrheic viruses. In: Murray PR, Baron EJ, Pfaller MA, Tenover FC, Yolken RH, eds. Manual of clinical microbiology. Washington: ASM Press, 1999: 1005-13.
Pina, S., J. Jofre, S.U. Emerson, R.H. Purcell and R. Girones. 1998. Characterization of a strain of infectious hepatitis E virus isolated from sewage in an area where hepatitis E is not endemic, Appl. Environ. Microbiol. 64 (11), pp. 4485–4488.
Pusch D, Oh DY, Wolf S, Dumke R, Schroter-Bobsin U, Hohne M, Roske I, Schreier E. 2005. Detection of enteric viruses and bacterial indicators in German environmental waters. Arch Virol. 150(5):929-47.
Scott et al., 2005 "Potential Use of a Host Associated Molecular Marker in Enterococcus faecium as an Index of Human Fecal Pollution." Environmental Science & Technology 39: 283-287.
Singh, V., M. Raje, C.K. Nain and K. Singh. 1998. Routes of transmission in the hepatitis E epidemic of Saharanpur, Trop. Gastroenterol. 19 (3), pp. 107–109.
Sirinavin S, Techasaensiri C, Okascharoen C, Nuntnarumit P, Tonsuttakul S, Pongsuwan Y. 2006. Neonatal astrovirus gastroenteritis during an inborn nursery outbreak. J Hosp Infect. 64(2):196-7.
Weber R, Ledergerber B, Zbinden R et al. Enteric infections and diarrhea in human immunodeficiency virus-infected persons: prospective community-based cohort study. Arch Intern Med 1999; 159: 1473-80.
Journal Articles on this Report:
Stedtfeld RD, Baushke S, Tourlousse D, Chai B, Cole JR, Hashsham SA. Multiplex approach for screening genetic markers of microbial indicators. Water Environment Research 2007;79(3):260-269. R831628 (2006)
Supplemental Keywords:
biochips, microarrays, fecal pollution, water, RFA, Scientific Discipline, Health, PHYSICAL ASPECTS, Water, Ecosystem Protection/Environmental Exposure & Risk, Health Risk Assessment, Environmental Chemistry, Monitoring/Modeling, Risk Assessments, Physical Processes, Environmental Monitoring, Drinking Water, microbial contamination, monitoring, measurement, microbial risk assessment, biochip, microbiological organisms, detection, exposure and effects, virulence factor activity relationships, virulence factor biochip, bacteria monitoring, exposure, other - risk assessment, E. Coli, human exposure, microbial risk management, microorganism, measurement, assessment technology, drinking water contaminants, other - risk management
Progress and Final Reports:
Original Abstract
The perspectives, information and conclusions conveyed in research project abstracts, progress reports, final reports, journal abstracts and journal publications convey the viewpoints of the principal investigator and may not represent the views and policies of ORD and EPA. Conclusions drawn by the principal investigators have not been reviewed by the Agency.