Infection and Immunity, March 2004, p. 1496-1503, Vol. 72, No. 3
0019-9567/04/$08.00+0 DOI: 10.1128/IAI.72.3.1496-1503.2004
Copyright © 2004, American Society for Microbiology. All Rights Reserved.
Laboratory for Foodborne Zoonoses, Population and Public Health Branch, Health Canada, Guelph, Ontario N1G 3W4, Canada,1 Center for Vaccine Development, University of Maryland School of Medicine, Baltimore, Maryland 21201,2 Department of Pathology and Molecular Medicine, McMaster University, Hamilton, Ontario L8N 3Z5, Canada3
Received 9 June 2003/ Returned for modification 18 August 2003/ Accepted 13 November 2003
The scientific basis for the apparent differences in virulence between different serotypes is not known. Growing evidence suggests that major differences in virulence between groups of strains in bacteria such as E. coli, Salmonella enteritidis, and Helicobacter pylori may be related to the presence of specific pathogenicity islands (PAIs) (11, 19). One example of such a PAI is the locus of enterocyte effacement (LEE) which encodes the structural, accessory, effector, and regulatory molecules necessary for the development of the characteristic attaching and effacing cytopathology on enterocytes by some VTEC strains (27, 30). However, LEE cannot alone explain virulence differences between VTEC serotypes because the LEE-positive serotype O157:H7 is associated with outbreaks and HUS much more commonly than other LEE-positive serotypes such as O26:H11, O103:H2, and O111:NM (10, 30), and some LEE-positive serotypes from bovines have never been associated with human disease (43, 44). Furthermore, LEE-negative serotypes such as O113:H21 are also associated with HUS (10, 16, 30). These observations suggest that other hitherto unknown factors, perhaps on other PAIs, are also involved in modulating the virulence of VTEC.
The genome sequences of VTEC O157:H7 strains EDL933 (35) and Sakai (13) contain several additional putative PAIs (35), including the 87-kb O island 48 (OI-48) and the 23-kb OI-122 (35). OI-48 is duplicated (as OI-43) in the EDL933 genome (35), whereas it is present as a single copy (SpLE1) in the Sakai genome (13). OI-122 comprises 26 open reading frames (ORFs), including four putative virulence genes, Z4321, Z4326, Z4332, and Z4333, that encode proteins with significant homology to the Salmonella enterica serovar Typhimurium PagC (pagC) (36) (Z4321), Shigella flexneri enterotoxin (senA) (31) (Z4326), and the enterohemorrhagic E. coli factor for adherence (efa1) (32) (Z4332 and Z4333).
During a study to investigate the distribution of these OI-122 genes in different VTEC serotypes, we observed that VTEC strain CL3 (serotype O113:H21) was positive for gene Z4321 but negative for Z4326, Z4332, and Z4333 (21). The objective of the present study was to investigate whether Z4321 is part of an incomplete OI-122 or of a novel genomic structure.
TABLE 3. Distribution of Z1640 and S1 in different VTEC serotypes
DNA techniques. Standard techniques were used for DNA extraction, purification, analysis, and PCR (2, 38). Genomic DNA was isolated with the Genomic-tip 100/G (Qiagen).
Genome walking. Adjacent DNA sequences upstream and downstream of Z4321 in the CL3 genome were investigated with the Clontech Universal GenomeWalker kit (Clontech Laboratories, Inc.). Details of methodology are accessible from the user manual at http://www.clontech.com/. Briefly, CL3 genomic DNA was digested with one of four different blunt-end restriction enzymes (provided by the manufacturer). The purified restricted DNA fragments were ligated to an adaptor (supplied by the manufacturer) and became part of four genome-walking libraries generated by the four different restriction enzymes. Gene-specific primers were designed and used in combination with adaptor-specific primers to amplify large genomic segments adjacent to genes of interest by long-distance PCR with the Expand System kit (Roche Diagnostics, Mannheim, Germany) in the PCR Express thermal cycler (ThermoHybaid, Middlesex, United Kingdom). PCR products were gel purified with the Qiagen QIAquick gel extraction kit for sequencing. The terminal portions of the sequenced DNA regions were used to design primers for subsequent sequential genome-walking steps either upstream or downstream.
DNA sequencing and open reading frame (ORF) prediction and annotation. DNA sequencing was performed by the Laboratory Service Division, University of Guelph, on an Applied Biosystems ABI Prism 377 DNA sequencer. DNA sequence data were aligned, edited, and assembled into contiguous sequence with the Omiga program (Oxford Molecular Ltd.). Putative ORFs larger than 150 bp were identified with GeneMark (26), Lasergene (DNAStar Inc.), and Glimmer 2.02 (6) applications. ORFs homologous to sequences in the EDL933 genome were annotated according to the EDL933 sequence (35). Amino acid homology with other ORFs in the public databases was sought with the BlastN, BlastP and BlastX programs (1) at the National Center for Biotechnology Information website (http://www.ncbi.nlm.nih.gov).
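To illustrate the 150-bp size filter described above, the following is a minimal sketch of a naive six-frame ORF scan. It is not the algorithm used by GeneMark, Lasergene, or Glimmer, which rely on statistical gene models; it assumes an ORF runs from the first ATG in a frame to the next in-frame stop codon, and the function names (`find_orfs`, `orfs_in_strand`, `reverse_complement`) are hypothetical.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def orfs_in_strand(seq, min_len):
    """Return 0-based, half-open (start, end) spans of ATG..stop ORFs on one strand."""
    found = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first start codon opens the ORF
            elif start is not None and codon in STOPS:
                if i + 3 - start > min_len:    # keep ORFs larger than min_len bp
                    found.append((start, i + 3))
                start = None
    return found


def find_orfs(seq, min_len=150):
    """Scan both strands; coordinates are always reported on the input strand."""
    seq = seq.upper()
    n = len(seq)
    result = [("+", s, e) for s, e in orfs_in_strand(seq, min_len)]
    result += [("-", n - e, n - s)
               for s, e in orfs_in_strand(reverse_complement(seq), min_len)]
    return result


# Toy sequence (not CL3 data): a 156-bp ORF flanked by two bases on each side.
toy = "CC" + "ATG" + "GCT" * 50 + "TAA" + "GG"
print(find_orfs(toy))
```

A scan of this kind only nominates candidate reading frames; the annotation step in the text still depends on homology searches against EDL933 and the public databases.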
Test of contiguity of adjacent genome-walking sequences by Southern hybridization. To ensure contiguity of the sequences from the genome-walking library, Southern blotting was applied to BsiWI and EcoRI restriction fragments of the CL3 genome to determine if the hybridized fragments on the CL3 genome matched the sizes of fragments predicted from the genome-walking sequences. In practice, two samples of CL3 genomic DNA (2.5 µg) were digested with BsiWI and EcoRI, respectively, and separated by agarose gel electrophoresis. DNA fragments were transferred overnight by capillary transfer onto positively charged nylon membranes (Roche Diagnostics) (38). Digoxigenin-labeled probes were synthesized with a PCR digoxigenin probe synthesis kit (Roche Diagnostics) according to the manufacturer's instructions. The digoxigenin nucleic acid detection kit (Roche Diagnostics) was used to detect hybridized bands.
Probe B1 (555 bp), generated with primers B1F (5'-GTAAAACCCGGTGATGTGAAC-3') and B1R (5'-CGTACTCTGTGTGACGCCCTG-3'), specific for the region spanning the second BsiWI restriction site (Fig. 1), was used to hybridize the BsiWI blot. Probes E1 (521 bp) and E2 (494 bp) were used to probe the EcoRI blot separately. Probe E1 was generated with primers E1F (5'-CTGAAACCGTTTGTGGCTGG-3') and E1R (5'-GACATACAGAAAGCGGACGAG-3'), specific for the region spanning the second EcoRI restriction digestion site (Fig. 1). Probe E2 was generated with primers B2F (5'-GAATCTACGGCGATGCTG-3') and B2R (5'-GGTTTCTTCCACGGTACG-3'), specific for the region following the first EcoRI site.
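The contiguity test compares observed band sizes with fragment sizes predicted from the assembled sequence. A minimal sketch of that prediction step follows, assuming a linear molecule and complete digestion; the function name `fragment_sizes` and the toy sequence are hypothetical and are not taken from the CL3 data. Both recognition sites (BsiWI, C^GTACG; EcoRI, G^AATTC) are palindromic, so searching one strand is sufficient.

```python
import re

# Recognition sequence and top-strand cut offset for the two enzymes named in the text.
ENZYMES = {
    "BsiWI": ("CGTACG", 1),  # C^GTACG
    "EcoRI": ("GAATTC", 1),  # G^AATTC
}


def fragment_sizes(seq, enzyme):
    """Return fragment lengths (bp), left to right, for a complete digest of a linear sequence."""
    site, offset = ENZYMES[enzyme]
    seq = seq.upper()
    # The lookahead pattern also reports overlapping sites, should any occur.
    cuts = [m.start() + offset for m in re.finditer(f"(?={site})", seq)]
    bounds = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


# Toy sequence (illustrative only) carrying two EcoRI sites and no BsiWI site.
toy = "AAAA" + "GAATTC" + "TTTTT" + "GAATTC" + "CC"
print(fragment_sizes(toy, "EcoRI"))
print(fragment_sizes(toy, "BsiWI"))
```

In a Southern blot only the fragment overlapping the probe is detected, so the size to compare with the hybridized band is that of the predicted fragment containing the probe region.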
FIG. 1. Genetic and physical map of the sequenced region of the genomic island of E. coli O113:H21 strain CL3. BsiWI and EcoRI restriction sites and the sizes and locations of DNA fragments amplified by genome walking PCR are marked. Southern hybridization probes B1, E1, and E2 are indicated. EDL933 genes are indicated by grey shading, and new genes are indicated by black shading. Arrows indicate the size, location, and transcription orientation of ORFs present in the region. Grey boxes indicate the position of Z1640 gene fragments, which are separated by genomic segments. White boxes indicate the position of the 190-bp direct repeat H8 (H8DR). The peak diagram is the G+C content of the region determined with a 101-nucleotide window. The line at 51% represents the average G+C content of the E. coli K-12 chromosome. GS-I, genomic segment I; GS-II, genomic segment II; OI-122, the fragment from OI-122; H8DR, H8-homologous direct repeat; DR12, the 12-bp direct repeat GCAGGGGCTGGA.
TABLE 2. Primers used in RT-PCR for detecting gene expression
G+C content analysis. A plot of G+C content along the 27,297-bp sequence was generated by the sliding 101-nucleotide window method (29).
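A sliding-window G+C profile of this kind can be sketched as follows. This is an illustrative implementation under stated assumptions, not the procedure of reference 29: the window advances one nucleotide at a time, each value is assigned to the window's center position (1-based), and the function name `gc_content_windows` is hypothetical.

```python
def gc_content_windows(seq, window=101):
    """Return (center_position, percent_gc) for every full window; positions are 1-based."""
    seq = seq.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    is_gc = [1 if base in "GC" else 0 for base in seq]
    half = window // 2
    count = sum(is_gc[:window])                 # G+C count in the first window
    results = [(half + 1, 100.0 * count / window)]
    for start in range(1, len(seq) - window + 1):
        # Slide by one base: add the entering base, drop the leaving base.
        count += is_gc[start + window - 1] - is_gc[start - 1]
        results.append((start + half + 1, 100.0 * count / window))
    return results


# Toy example with a 3-nucleotide window (the paper uses 101 nucleotides).
profile = gc_content_windows("GCATAT", window=3)
print(profile)
```

Plotting the second element of each tuple against the first gives a trace comparable in form to the peak diagram in Fig. 1, against which a reference line such as the 51% E. coli K-12 average can be drawn.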
Nucleotide sequence accession number. The nucleotide sequence reported in this paper has been deposited in the GenBank database (accession number AY275838).
FIG. 2. Ethidium bromide-stained agarose gel (A) and corresponding Southern hybridization analysis (B) of BsiWI-digested genomic DNA probed with digoxigenin-labeled probe B1, and agarose gel (C) and corresponding Southern hybridization analysis (D) of EcoRI-digested genomic DNA probed with digoxigenin-labeled probe E1. The probes hybridized with E. coli strain CL3 in both blots, but not with K-12. Lane 1, DNA size markers (Invitrogen); lane 2, CL3; lane 3, K-12.
Features of the sequenced region. Analysis of the sequenced region of the CL3 genome indicates that it is part of a hybrid genomic island that contains segments of two EDL933 genomic islands, OI-48 and OI-122, as well as sequences that show homology to Yersinia pestis genes (Fig. 1). The average G+C content of the whole sequenced region is 49.58%. Its left terminus contains ORFs that show homology to EDL933 OI-48 genes Z1635, Z1636, and Z1637. The right terminal end of the region consists of ORFs that show homology to EDL933 OI-48 genes Z1641, Z1642, Z1643, and Z1644 (Fig. 1). Z1641 and Z1642 are contained in a single, putative helicase gene (S15). Central to the region of this CL3 genomic island is gene Z1640, which is separated into three fragments by two genomic segments (GS-I and GS-II) (Fig. 1). The three Z1640 fragments, Z1640-1, Z1640-2, and Z1640-3, have sizes of 539 bp, 355 bp, and 168 bp, respectively. The region between Z1640-1 and Z1640-2 consists of a 2,973-bp genomic segment (GS-I), which comprises ORFs S1, S2, and S3. Z1640-1 is part of S1 and Z1640-2 is part of S3 (Fig. 1). There is a 12-bp direct repeat (GCAGGGGCTGGA, part of the Z1640 sequence, referred to as DR12 in Fig. 1) at both ends of GS-I (bp 2427 to 2438 and bp 5412 to 5423).
The region between Z1640-2 and Z1640-3 contains a 19,353-bp genomic segment (GS-II). In GS-II, there are two 190-bp-long direct repeats located at bp 12070 to 12259 (in S4) and at 17447 to 17636 (in S10). These repeats are 90% identical. Each has a 100-bp sequence that shows strong homology (91% identity for the left and 95% identity for the right) to the E. coli reference collection (ECOR3) random amplified polymorphic DNA fragment H8 (GenBank accession number AF127011) (15). They are thus referred to as H8DR (H8-homologous direct repeat) in Fig. 1. The region between the two H8DRs comprises the five ORFs S5 to S9, two of which (S6 and S7) have homology to transposases, suggesting that the region is an insertion sequence-associated element. Adjacent to the latter is the OI-122 region comprising part of Z4322, the complete Z4321, and part of Z4318.
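Direct repeats such as DR12 and the H8DR pair can be located and compared with very simple string operations. The sketch below is illustrative only: `find_motif` reports exact matches with 1-based start positions, and `percent_identity` is an ungapped, position-by-position comparison of two equal-length sequences, which is an assumption and may differ from the alignment method behind the 90% figure reported above. Both function names and the toy sequences are hypothetical.

```python
def find_motif(seq, motif):
    """Return 1-based start positions of every exact (possibly overlapping) match."""
    seq, motif = seq.upper(), motif.upper()
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start + 1)
        start = seq.find(motif, start + 1)
    return positions


def percent_identity(a, b):
    """Ungapped percent identity between two sequences of equal length."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


DR12 = "GCAGGGGCTGGA"  # the 12-bp direct repeat given in the text

# Toy sequence (not CL3 data) with the repeat flanking a short insert.
toy = "AT" + DR12 + "CCCC" + DR12
print(find_motif(toy, DR12))
print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
```

Applied to the deposited sequence (GenBank accession number AY275838), a search of this kind would be expected to return the two DR12 coordinates given in the text, although that has not been verified here.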
Genes in the sequenced region. (i) EDL933 OI-48 ORFs. The CL3 predicted Z1635, Z1636, Z1637, Z1643, and Z1644 gene products are 97%, 96%, 95%, 97%, and 98% identical, respectively, to their EDL933 counterparts at full length (Table 1). The functions of all these genes are unknown. Z1638 and Z1639 are absent in the CL3 genome. The sequences of the two EDL933 genes Z1641 and Z1642 are contained in one gene, S15, in CL3 (Table 1), which turns out to encode a UvrD/Rep-helicase (Conserved Domain database, abbreviated CD, pfam00580) (Table 1).
TABLE 1. Amino acid homologies of the putative ORFs identified in the sequenced region
Both predicted products of S3 and S4 are homologous to the putative hemolysin (encoded by gene YPO2490) as well as the putative adhesin (encoded by gene YPO0599) in Y. pestis (33). The YPO2490 and YPO0599 products are large proteins, with 2,535 and 3,295 amino acids, respectively. The first 527 of 616 amino acids of S3 are 53% identical to the putative Y. pestis hemolysin, and the first 557 amino acids are 50% identical to the putative Y. pestis adhesin. S4 is a large gene encoding a protein with 2,343 amino acid residues, of which 2,085 amino acid residues are 40% identical to the YPO2490 product, and 2,123 amino acids are 39% identical to the YPO0599 product. No conserved domain match was found for either the S3 or S4 protein. S5 is a small gene encoding a protein with 74 amino acids in which a 55-amino-acid sequence is identical to the YPO0599 product.
(iii) Putative transposase genes. The S6 product has 79 of 109 amino acids with 38% homology to a putative transposase encoded by gene Y2435 in Y. pestis KIM (7), but no conserved domain is found for this protein (Table 1). The predicted product of S7 is 44.4% aligned with the integrase core domain rve (CD pfam00665), and shows 60% and 55% identity to the carboxyl-terminal region of the TnpA transposase of Pseudomonas syringae (12) and Y. pestis (33), respectively. A region upstream of S7 (from bp 14499 to 14975) encodes the amino-terminal region of the TnpA transposase. A 1-bp insertion or a -1 translational frameshift before the stop codon of the amino-terminal region would make the region from 14,499 to 15,526 bp (S7 and the region upstream of S7) into a complete coding region for a transposase with 100% alignment with the integrase core domain. However, the coding region for the amino-terminal region is not predicted to be an active gene (Fig. 1, Table 1).
(iv) OI-122 homologous ORFs. S10, S11, and S12 are EDL933 OI-122 homologues (Table 1). S10 contains the H8DR-2 sequence and part of the OI-122 segment. Its amino acids 30 to 80 (out of 80) are 94% identical to Z4322 in EDL933. The S11 protein shows 98% homology to the Z4321 product of EDL933 (35). S12 comprises part of the OI-122 segment and its downstream sequence, and its product is 96% identical to Z4318 over amino acids 1 to 240 (out of 604). Overall, however, the S12 protein appears to belong to the family of UvrD/Rep-helicases (CD pfam00580), which catalyze ATP-dependent unwinding of double-stranded DNA to single-stranded DNA.
(v) Other ORFs. The S8 product has 117 of 175 amino acids with 27% homology to the FN0835 product of Fusobacterium nucleatum (17). The S9 product has 68 of 91 amino acids 32% like the unknown Bacillus subtilis protein YozI (25). Amino acids 25 to 93 of 101 of the S13 protein are 30% identical to the ST0071 esterase in Sulfolobus tokodaii (23). The first 217 amino acids of 235 of the S14 protein are 39% homologous to the unknown protein encoded by Y2679 in Y. pestis KIM (7).
Expression of genes in the sequenced region. Expression of the predicted genes after CL3 growth in LB broth at 37°C with shaking was determined by reverse transcription PCR. The results showed PCR bands of the predicted sizes (Table 2) for all genes (Fig. 3). Although not visible in Fig. 3, very faint bands of the predicted sizes for Z1637 and S10 were seen. This indicates that the genes are transcribed in the direction predicted and are active, at least at the transcriptional level. This supports our gene predictions for this genomic region.
FIG. 3. Ethidium bromide-stained agarose gel of reverse transcription PCR products of RNA isolated from E. coli strain CL3. Reverse transcription was carried out with the primer specific to each gene, followed by PCR with specific primers and with the reverse transcription product as the template. PCR without prior reverse transcription was used as a control to confirm that the reverse transcription PCR products were attributable to RNA. Gene designations are indicated above each lane.
The unique character of PAI ICL3 appears to be due to genomic events related to OI-48 genes. The genes at both ends of the sequenced PAI ICL3 region resemble OI-48 genes Z1635 to Z1644, which, in CL3, are highly conserved and present in the same order as in EDL933. Of particular interest is the observation that one of the OI-48 genes, Z1640, is separated into three fragments, the spaces between which are occupied by genomic segments, GS-I and GS-II, which include a Y. pestis-like hemolysin/adhesin gene cluster and an EDL933 OI-122 segment (Fig. 1).
The initial event in the generation of PAIs is probably the preferential integration of plasmids, phages, conjugative transposons, or cointegrates of these into the chromosome (11, 18). Further evolution of the PAI can occur by a number of processes, including recombination events between insertion sequence elements, direct repeats, and other homologous sequences, leading to deletion or acquisition of DNA and generation of mosaic structures (11, 18, 39). In some cases, whole PAIs can be deleted. For example, mutants exhibiting spontaneous deletion of PAI IC5 on sheep blood agar have been observed in E. coli strain C5 that was isolated from a patient with neonatal meningitis (14). While such processes tend to make PAIs unstable, mutations in genes associated with recombination events, such as insertion sequence elements and direct repeats, or in those associated with mobility, such as transposases and integrases, leading to their inactivation, can serve to stabilize PAIs (11).
One possibility for the development of PAI ICL3 is that it evolved in a stepwise manner through the insertion of different genetic elements. In favor of this is the observation that the GS-I and H8DR-bordered regions are bracketed by direct repeat sequences (Fig. 1), a feature typical of transposon insertions (5), which provides evidence that these elements were acquired as insertions. Furthermore, there is evidence that this region may have become stabilized through the inactivation of a transposase enzyme. Between bp 14499 and 14975 in the PAI ICL3 region, there is a sequence (S7 and its preceding region) that probably encoded an active transposase which is postulated to have facilitated the insertion of the H8DR-bordered segment (Fig. 1) into the region. However, following the acquisition of the segment, deletion of one base pair leading to the creation of a stop codon in the middle of the sequence resulted in truncation of the transposase gene.
An alternative explanation for the aberrant sequence is that it facilitates the regulation of an active transposase through the process of translational frameshifting (4, 40). This would require a -1 frameshift before the stop codon of the amino-terminal region to translate an active transposase enzyme (4). Translational frameshifting occurs with the expression of some transposase genes. However, in CL3 the two parts of the transposase coding region do not overlap, which is a requirement for a -1 frameshift translation (4). Moreover, the coding region for the amino-terminal region is not predicted to be an active gene.
The hemolysin/adhesin cluster on PAI ICL3 represents potential virulence genes for CL3. However, S1, used as a marker of this cluster, was absent in the most virulent VTEC serotypes that are associated with outbreaks and HUS (Table 3), but was present in less virulent serotypes (O113:H21 and O91:H21) and in about half the cattle strains that have not been associated with human disease (Table 3). Thus, the hemolysin/adhesin virulence genes may have no role in human disease. On the other hand, the fact that the Y. pestis-like adhesin genes are present only in LEE-negative serotypes (Table 3) suggests that they could provide an alternative mechanism to the LEE-mediated attaching and effacing lesion. For example, they could account for the mechanism of adherence of a strain of the LEE-negative serotype O113:H21, which has been shown to adhere to cultured cells in vitro and to stimulate a signal transduction response in a manner that is different from that of serotype O157:H7 strains (9). However, other candidate adhesins have also been proposed for VTEC serotype O113:H21 strains. These include Saa, a plasmid-mediated adhesin (34), and a chromosomal region homologous to EDL933 OI-154 which encodes long polar fimbriae (8).
In contrast to the hemolysin/adhesin gene cluster that is found in strains that appear to be less virulent than E. coli O157:H7 strains, intact Z1640 was present exclusively in O157:H7 strains and other serotypes that are associated with HUS and epidemic disease (Table 3). Z1640 could therefore, theoretically, be a virulence gene and warrants experimental investigation.
We have argued for a stepwise evolution of PAI ICL3 through acquisition of insertions because of the presence of several genetic elements bracketed by direct repeat sequences. However, it is also possible that PAI ICL3 more closely resembles the ancestral, and less virulent, state and that the EDL933 OI-48 arose as part of the more pathogenic serotype O157:H7 through complete or stepwise deletions of the GS-I and GS-II segments, leading to the formation of the putative virulence gene Z1640. Such a scenario could also explain the UvrD helicase-like gene (S15) as being the ancestral state and the two EDL933 ORFs Z1641 and Z1642 as a derived state resulting from the decay of the original full-length coding region.
In conclusion, the CL3 genomic region reported here is a part of a hybrid putative PAI (PAI ICL3) that contains segments of EDL933 OI-48 and OI-122 and other putative virulence factors, including a Y. pestis-like hemolysin/adhesin cluster. OI-122 elements have also been reported to be part of mosaic structures with the LEE PAI in enteropathogenic rabbit E. coli strains 83/89 and RDEC-1 of serotype O15:H- (41, 45) and in some LEE-positive human enteropathogenic and verocytotoxin-producing E. coli strains (28). These studies and ours provide evidence for considerable plasticity in EDL933 genomic islands, which may contribute to the ongoing evolution of VTEC.
We thank T. Whittam, F. Jamieson, J. Isaac-Renton, J. Preiksaitis, P. VanCaeseele, S. Aleksic, and P. Desmarchellier for generously providing the VTEC strains used in this study.