ABSTRACT
The attaching-and-effacing (A/E) phenotype mediated by factors derived from the locus of enterocyte effacement (LEE) is a hallmark of clinically important intestinal pathotypes of Escherichia coli, including enteropathogenic (EPEC), atypical EPEC (ATEC), and enterohemorrhagic E. coli strains. Epidemiological studies indicate that the frequency of diarrhea outbreaks caused by ATEC is increasing. Hence, it is of major importance to further characterize putative factors contributing to the pathogenicity of these strains and to gain additional insight into the plasticity and evolutionary aspects of this emerging pathotype. Here, we analyzed the two clinical ATEC isolates B6 (O26:K60) and 9812 (O128:H2) and compared the genetic organizations, flanking regions, and chromosomal insertion loci of their LEE with those of the LEE of other A/E pathogens. Our analysis shows that the core LEE is largely conserved—particularly among genes coding for the type 3 secretion system—whereas genes encoding effector proteins display a higher variability. Chromosomal insertion loci appear to be restricted to selC, pheU, and pheV. In contrast, striking differences were found between the 5′- and 3′-associated flanking regions reflecting the different histories of the various strains and also possibly indicating different lines in evolution.
Intestinal pathogenic Escherichia coli is an important causative agent of infectious diseases and is responsible for severe diarrhea with distinct manifestations mediated by specific patterns of virulence factors. Some of the best-studied E. coli pathovars are enteropathogenic E. coli (EPEC) strains, which were first described in the late 1940s as the causative agents of infant diarrhea in nurseries (52). EPEC strains are characterized by the induction of attaching-and-effacing (A/E) lesions encompassing the destruction of microvilli of epithelial cells and the intimate adherence of the bacteria to the host cells (19, 26). The destruction of microvilli results in a reduced absorptive capacity of intestinal epithelial cells. In addition, diarrhea is actively enhanced by a breach in the gastrointestinal barrier due to the loosening of tight junctions (46, 51), lysis of mitochondria (38), interruption of the cell cycle (42), induction of apoptosis (16, 17), and redistribution of aquaporin channels (27).
All factors that are responsible for the formation of A/E lesions are encoded on an ∼35-kb chromosomal pathogenicity island (PAI), termed the locus of enterocyte effacement (LEE) (43, 44). Besides EPEC, additional intestinal E. coli pathotypes and other Enterobacteriaceae have been found to harbor the LEE PAI and to induce A/E lesions, including atypical EPEC (ATEC), certain Shiga toxin-producing E. coli (STEC), Citrobacter rodentium, Escherichia albertii, and Hafnia alvei (19). Most of these strains are not only pathogenic for humans but also affect other mammals, such as mice, rabbits, and cattle (7, 47).
The LEE PAI was first described in 1995 in the human prototype EPEC strain E2348/69 (O127:H6). The LEE has been found neither in the normal physiological bacterial flora nor in the E. coli strain K-12 derivatives (44). As the G+C content of the LEE (∼38%) differs considerably from that of the E. coli genome (∼50%), the LEE has been acquired by EPEC probably via horizontal gene transfer from a thus-far-unknown ancestor (20, 44). Recent studies demonstrated the spontaneous horizontal transfer of the LEE from a donor strain to a recipient strain, supporting the contribution of horizontal gene transfer in pathogen evolution (65). Like other PAIs, the LEE is generally located at tRNA gene loci that are often used also as insertion sites for bacteriophages. The LEE of the reference strain E2348/69 is integrated at approximately 82 min in the selC tRNA gene. Further tRNA genes containing the LEE are the phenylalanine tRNA genes pheU (at 94 min) and pheV (at 67 min of the E. coli K-12 genome).
The LEE of E2348/69 contains 41 open reading frames (ORFs). Most of the ORFs are organized in one of five polycistronic operons (LEE1 to LEE5) (20, 64). The LEE encodes the Esc (E. coli secretion) factors of the type III secretion system (T3SS), the effector proteins E. coli-secreted proteins (Esps), intimin, Tir (translocated intimin receptor), and Map (mitochondrial associated protein), as well as Sep (secretion of EPEC proteins) factors, which exhibit either T3SS or effector functions. In addition, the Ces (chaperone of E. coli-secreted protein) chaperones and the regulatory proteins Ler (LEE-encoded regulator), GrlR (global regulator of LEE proteins, repressor), and GrlA (global regulator of LEE proteins, activator) are encoded on the LEE. In vitro studies showed that the sole transfer of the LEE to the nonpathogenic E. coli K-12 strain enabled this strain to induce A/E lesions (24, 43).
In recent years, the question of whether atypical enteropathogenic E. coli (ATEC) strains (5), which due to the lack of the EAF plasmid cannot produce bundle-forming pili, are able to cause diarrhea has been discussed controversially as ATEC strains are detected also in healthy volunteers (25). However, ATEC strains were identified as causative agents not only of sporadic diarrhea but also of outbreaks in different countries such as Australia (59), Great Britain (40), Iran (11), Japan (69), Poland (55), and South Africa (22). Furthermore, epidemiological studies showed that in industrialized as well as in developing countries the frequency of diarrhea caused by ATEC increases both absolutely and also relatively compared to EPEC-induced diarrhea (15, 62). It is therefore of major importance to further characterize putative factors contributing to the pathogenicity of ATEC strains and to gain additional insight into plasticity and evolutionary aspects of this emerging pathotype.
In this study, we analyzed a collection of ATEC strains for the presence of several virulence genes, including escV (LEE); bfpB (EAF plasmid); stx1, stx2, invE, elt, estIa, estIb, astA, aggR, pic, and uidA (49); α-hly (α-hemolysin), e-hly (enterohemorrhagic E. coli [EHEC] hemolysin), lifA (efa1) (lymphostatin), and ent (enterotoxin homologue of ShET-2, Shigella enterotoxin 2). lifA (efa1) of EPEC E2348/69 has been described as lymphocyte inhibitory factor A (LifA), which inhibits the mitogen-induced proliferation of peripheral blood lymphocytes and lamina propria mononuclear cells, as well as the synthesis of proinflammatory cytokines (39). The same protein was described in STEC strains as EHEC factor for adherence 1 (Efa1) that is present on the surface of the bacteria and mediates the direct contact to the host cells (3). Ent is a homologue of the Shigella enterotoxin ShET-2. The enterotoxic activity of ShET-2 leads to a significant increase in transepithelial resistance (TER) without causing tissue destruction (12).
In addition, we characterized the LEE PAIs of the two clinical ATEC strains B6 (O26:K60) and 9812 (O128:H2) by comparative sequence analysis and compared their organizations, flanking regions, and chromosomal insertion loci with those of the LEE of other known A/E E. coli and with those of C. rodentium DBS100. Our analysis showed that the core LEE is largely conserved, particularly in those genes encoding the T3SS, whereas genes encoding secreted effector proteins exhibit considerable variability between isolates. The chromosomal insertion sites are likewise conserved and appear to be restricted to selC, pheU, and pheV. In contrast, remarkable differences were found between the 5′- and 3′-LEE-associated flanking regions, possibly indicating a distinct evolutionary descent.
MATERIALS AND METHODS
Bacterial strains and growth conditions. The clinical ATEC isolates B6 (O26:K60) and 9812 (O128:H2) were investigated by comparative analysis of the LEE and its flanking sequences. EPEC E2348/69 (O127:H6), ATEC 3431-4/86 (O8:H−), ATEC 10459 (O55:H−), and ATEC 0181-6/86 (O119:H9:K61) were used for control experiments. Several other clinical isolates were analyzed with respect to their genotypic and phenotypic characteristics (see Table 3). Bacterial strains were grown in liquid medium or on solid LB medium at 37°C with the appropriate antibiotics (100 μg/ml ampicillin, 50 μg/ml chloramphenicol, 15 μg/ml gentamicin, 50 μg/ml kanamycin, 20 μg/ml nalidixic acid, 50 μg/ml streptomycin, 12 μg/ml tetracycline). Serotyping and the identification of “rough” strains were performed by the National Reference Center for Bacterial Gastroenteritis at the Robert Koch Institute employing the whole spectrum of typing sera for E. coli O and H antigens (56). Rough strains were detected by standard slide agglutination assay.
Detection of virulence factors and LEE insertion sites by PCR. PCR was performed in 200-μl reaction tubes using a 25-μl reaction mixture consisting of 2 U Taq DNA polymerase and the appropriate buffer (Segenetic, Borken, Germany), 0.3 mM (each) deoxynucleoside triphosphates (dNTPs), and 0.4 μM PCR primers (Table 1). As templates, single bacterial colonies were picked from freshly incubated standard I agar plates and suspended for 1 min in the reaction mixture on ice. PCR-amplified fragments (10 μl) were separated on agarose gels and visualized under UV light after staining with ethidium bromide. The characterization of the LEE insertion sites was performed by PCR as described previously (6, 44, 65). Primer pairs employed in the PCR analysis are listed in Table 2.
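As an illustration of how the final concentrations stated above translate into pipetting volumes, the following minimal sketch applies the dilution relation C1 × V1 = C2 × V2 to the 25-μl reaction. The stock concentrations (10 mM per dNTP, 10 μM primer) and the function name are assumptions chosen for the example; they are not given in the protocol.

```python
def stock_volume_ul(final_conc: float, stock_conc: float,
                    reaction_volume_ul: float = 25.0) -> float:
    """Volume of stock solution (in microliters) needed to reach final_conc.

    Uses C1 * V1 = C2 * V2. final_conc and stock_conc must be given in the
    same unit (e.g., both in mM or both in uM).
    """
    if stock_conc <= 0 or final_conc < 0:
        raise ValueError("concentrations must be positive")
    if final_conc > stock_conc:
        raise ValueError("final concentration exceeds stock concentration")
    return final_conc * reaction_volume_ul / stock_conc


# Final concentrations are taken from the text; the stock concentrations
# below are hypothetical and must be replaced by the actual stocks in use.
dntp_ul = stock_volume_ul(final_conc=0.3, stock_conc=10.0)    # mM
primer_ul = stock_volume_ul(final_conc=0.4, stock_conc=10.0)  # uM
```

With different stocks, only the `stock_conc` arguments change; the reaction volume defaults to the 25 μl used here.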
PCR primers used for the identification of selected virulence factors and the determination of cosmid sequences
PCR primers for the determination of LEE insertion
Construction of cosmid libraries. For the construction of cosmid libraries, the SuperCos1 cosmid vector kit and Gigapack III Gold packaging extract (Stratagene, Heidelberg, Germany) were used according to the manufacturer's instructions. Briefly, 100 μg genomic DNA was digested with 1 μl (diluted 1:100) Bsp143I (MBI Fermentas, St. Leon-Rot, Germany) for 5 min, purified by phenol-chloroform extraction and ethanol precipitation, dephosphorylated by treatment with calf intestinal alkaline phosphatase for 1 h, and, following a second phenol-chloroform extraction and ethanol precipitation step, suspended in 30 μl of 10 mM Tris-HCl (pH 8.0). The genomic DNA fragments of sizes ranging from 30 to 42 kb were ligated into the prepared SuperCos1 vector, packaged into phages, and transformed into competent E. coli XL1 Blue MR bacteria. After overnight incubation on kanamycin-containing agar plates, the bacteria were rinsed and stored at −70°C.
Analysis of cosmid clones. For the identification of cosmids containing the whole LEE region, colony hybridization assays were performed. A total of 50,000 CFU were plated on kanamycin-containing agar plates (135 mm) and incubated overnight at 37°C. The bacteria were incubated for 30 min at 4°C and transferred onto a nylon membrane. The membrane was placed with the colonies upward on filter paper soaked in denaturing buffer (1.5 M NaCl, 0.5 M NaOH) and incubated for 15 min. After an additional 15 min of incubation on filter paper soaked in neutralization buffer (1.5 M NaCl, 0.5 M Tris-HCl, pH 7.0), the nylon membrane was exposed to 2× SSC (300 mM NaCl, 30 mM sodium citrate, pH 7.0) for 10 min on soaked filter paper. To cross-link and fix the DNA, the nylon membrane was exposed to UV light (302 nm). Afterwards, remnants of bacterial cells were removed by proteinase K treatment (0.3 mg/ml in 2× SSC, 1 h, 37°C) followed by a 10-min wash with 2× SSC under continuous rotation. For hybridization reactions, ECL (enhanced chemiluminescence)-labeled probes were prepared and applied with the ECL direct nucleic acid labeling and detection kit (Amersham, Braunschweig, Germany) following the instructions of the manufacturer. To detect LEE-positive clones, DNA probes were amplified by PCR with primers specific for escV (MP-escV-F and MP-escV-R), rorf1 (rorf1-F and rorf1-R), and/or the distal 3′ end of the core LEE (Orf27-F and espF-R) (Table 1). Positive clones were separated and verified by PCR analysis using the primer pairs indicated in Table 1.
Isolation of cosmid DNA and determination of nucleotide sequences. Cosmid DNA containing LEE-positive inserts was isolated with the Qiagen large construct kit (Qiagen, Hilden, Germany) and verified by sequencing of the insert ends with standard primers specific for T3cos1 and T7cos1. Subsequently, whole cosmids were analyzed by shotgun sequencing performed by AGOWA (Berlin, Germany) and Seqlab (Göttingen, Germany).
Comparative sequence analysis and the generation of phylogenetic trees. The comparative sequence analysis was performed with MacVector (Accelrys, Cambridge, United Kingdom), NCBI BLAST (National Center for Biotechnology Information [NCBI], Bethesda, MD), and WebACT (Imperial College London, London, United Kingdom). The determination of sequence identities and/or similarities was performed with the ClustalW function of the MacVector program. Sequences with nucleotide sequence identities of >90% were designated highly homologous. The MacVector program was used for the generation of phylogenetic trees on the basis of the ClustalW analysis. The analysis of the genetic organization of larger DNA regions was accomplished with WebACT using standard settings. The determination and analysis of the ORFs of LEE PAIs of the 11 E. coli strains examined, E2348/69 (AF022236), 0181-6/86 (AJ633129), 3431-4/86 (AJ633130), B6 (FM201463), 9812 (FM201464), EDL933 (AE005174), RIMD 0509952 (Sakai; BA000007), RDEC-1 (AF200363), 83/39 (AF453441), 413/89-1 (AJ277443), and RW1374 (AJ303141), as well as C. rodentium DBS100 (AF311901), were performed both with MacVector and with BLAST 2 sequences (NCBI, Bethesda, MD).
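The >90% identity criterion can be illustrated with a short sketch that computes the percent identity of two already aligned nucleotide sequences. This is not the MacVector/ClustalW implementation; in particular, ignoring gapped alignment columns is an assumption made for the example, and the function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over all alignment columns that contain no gap.

    Both inputs must be rows of the same pairwise alignment (equal length).
    Skipping gapped columns is an assumption of this sketch; alignment
    programs may count gaps differently.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # column contains a gap: not scored
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def is_highly_homologous(aligned_a: str, aligned_b: str,
                         threshold: float = 90.0) -> bool:
    """Apply the >90% nucleotide identity criterion used in this study."""
    return percent_identity(aligned_a, aligned_b) > threshold
```

The comparison is strict (`>`), so a pair with exactly 90% identity would not be designated highly homologous under this reading of the criterion.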
RNA isolation and reverse transcription. The isolation of whole RNA from E. coli was performed with the RNeasy minikit (Qiagen, Hilden, Germany) following the instructions of the manufacturer. Subsequently, RNA was treated with DNase of the TurboDNA free kit (Ambion/Applied Biosystems, Darmstadt, Germany) to remove DNA remnants. In order to analyze mRNA production of the bacteria, the purified RNA was reverse transcribed to cDNA employing the Revert Aid H minus cDNA synthesis kit (MBI Fermentas, St. Leon-Rot, Germany) using 1.5 μg RNA as the template and random hexamer primers.
FAS assay and detection of tyrosine phosphorylation. The adherence pattern of clinical isolates was analyzed by a modification of the method described by Vial et al. (66). Briefly, 70% confluent HeLa cells were infected with a bacterial suspension that had been preincubated for 2 h at 37°C and 10% CO2 (1:20 dilution in Dulbecco's modified Eagle's medium of an overnight culture) followed by 5 min of centrifugation (250 × g) and afterwards incubated for 3 h at 37°C in a 10% CO2 atmosphere. The cells were washed three times with prewarmed Dulbecco's phosphate-buffered saline (D-PBS) containing MgCl2 and CaCl2 to remove nonadherent bacteria and subsequently fixed for 15 min in 4% (wt/vol) paraformaldehyde (PFA). The fixed cells were washed with D-PBS, quenched in 0.2% (wt/vol) glycine for 10 min, and permeabilized with 0.1% (wt/vol) Triton X-100 in D-PBS plus 4% (wt/vol) PFA for 3 min. The cells were blocked with 3% (wt/vol) bovine serum albumin (BSA) in D-PBS for 30 min. For F-actin staining (FAS), phalloidin-Texas Red was used at a 1:100 dilution, and for DNA staining 4′,6-diamidino-2-phenylindole (DAPI) in dimethyl sulfoxide was used at a 1:1,000 dilution. Phosphorylation of tyrosine was detected with specific antisera (1:250; Cell Signaling Technology, Danvers, MA). After three washing steps, the cells were mounted with DABCO-Moviol (Dako, Hamburg, Germany).
Analysis of secreted bacterial proteins. Soluble proteins of the bacterial supernatant were precipitated with trichloroacetic acid at 4°C overnight. Afterwards, the proteins were pelleted by centrifugation at 15,000 × g and 4°C for 15 min. The supernatant was removed, and the remaining protein pellet was covered with ice-cold acetone. Thereafter, the proteins were pelleted again by centrifugation (17,000 × g, 5 min, 4°C), dried, suspended in sodium dodecyl sulfate sample buffer, and incubated for 10 min at 100°C. For analysis, the samples were subjected to sodium dodecyl sulfate-polyacrylamide gel electrophoresis and Western blotting using specific antisera for EspB and Tir (18).
Nucleotide sequence accession numbers. Cosmid nucleotide sequences were submitted to the EMBL nucleotide sequence database: LEE-B6 under accession no. FM201463 and LEE-9812 under accession no. FM201464.
RESULTS
Virulence factor matrix of clinical and environmental ATEC isolates. ATEC strains might harbor combinations of virulence factor genes enabling them to cause (severe) diarrhea. Therefore, we investigated a number of ATEC strains from our strain collection regarding their geno- and phenotypical characteristics. The genotypes of the strains were analyzed by multiplex PCR (MPCR) using primers specific for the virulence genes escV (LEE), bfpB (EAF plasmid), stx1, stx2, invE, elt, estIa, estIb, astA, aggR, pic, and uidA (49) and by single PCR experiments using primers specific for α-hly (α-hemolysin), e-hly (EHEC hemolysin), lifA (efa1) (lymphostatin), and ent (enterotoxin homologue of ShET-2, Shigella enterotoxin 2) (Table 3). The putative virulence potential of the ATEC strains was further characterized by their ability to adhere to host cells, to secrete effector proteins into the supernatant, to induce pedestal formation of host cells, and to survive antibiotic treatment.
Geno- and phenotypical characteristics of ATEC strains
All 24 ATEC strains examined in this study exhibited the expected genotype typical for ATEC, as they are escV positive (LEE) but negative for other discriminating virulence genes of intestinal E. coli pathotypes, such as bfpB (EAF) (Table 3). Interestingly, our analysis revealed a remarkable concordance among strains belonging to the same LEE tRNA insertion group (Table 3). Secretion of effector proteins and actin polymerization are shown for selected ATEC strains in comparison with the prototype EPEC strain E2348/69 (Fig. 1). Particularly for ATEC strains in which the LEE is integrated in the selC locus and which are frequently of serotype O55, secretion of T3SS-dependent effectors appeared to be substantially reduced, as demonstrated for ATEC strain 10459 (O55:H−) (Fig. 1). This is also reflected by the rare manifestation of actin polymerization leading to pedestal formation and Tyr phosphorylation underneath attaching bacteria (Fig. 1; ATEC 10459). However, these ATEC strains often harbor the virulence genes lifA (efa1) and ent. ATEC strains in which the LEE is integrated in the pheU locus frequently exhibit serotype O26, are FAS test positive, are very often resistant to several antibiotics, and are also lifA (efa1) positive. Strains in which the LEE is integrated in the pheV tRNA gene are frequently of serotypes O127 or O128, less frequently harbor lifA (efa1) and ent, and usually carry no antibiotic resistance. Moreover, not all of these strains are able to induce actin polymerization in host cells (Table 3). This indicates that the sole presence of the LEE PAI should not be taken as a marker for the pathogenic potential of a particular strain without further knowledge about the regulatory and/or trigger mechanism(s) resulting in sufficient secretion of effector proteins.
Secreted effector proteins and interactions with HeLa cells by FAS assay. (Left) Selected ATEC strains [9812 (O128:H2 pheV), B6 (O26:K60 pheU), 3431-4/86 (O8:H− pheU), and 10459 (O55:H− selC)] were investigated for T3SS-secreted effector proteins such as Tir and EspB in comparison with the EPEC prototype strain E2348/69 (O127:H6 selC). Lanes M, molecular mass markers. α-EspB, anti-EspB; α-Tir, anti-Tir. (Right) Exemplary actin polymerization/pedestal formation and tyrosine phosphorylation (PY) underneath attaching bacteria. Pedestal formation induced by ATEC strain 10459 (O55:H−) could only rarely be observed.
Sequencing of LEE PAIs derived from two clinical ATEC isolates. To analyze the LEE PAIs of different pathogenic ATEC types in further detail, we cloned and sequenced the LEE of the clinical ATEC strains B6 (O26:K60 pheU) and 9812 (O128:H2 pheV) and compared these sequences to LEE sequences in the database. For cloning, we constructed two cosmid libraries, one representing the genome of each strain, and isolated LEE-positive cosmid clones using hybridization and PCR assays for the specific detection of the LEE genes eae, escV, escU, rorf1, and espF. Positive clones were verified by sequencing of the insert ends of isolated cosmids. Afterwards the entire sequence of LEE-containing cosmids was determined by shotgun sequencing (AGOWA, Berlin, Germany; Seqlab, Göttingen, Germany). In addition to PCR-amplified DNA products, several cosmids were isolated, characterized, and sequenced to complete the LEE sequences of ATEC B6 and ATEC 9812 (Fig. 2). The sequence of LEE-B6 starts at the pheU locus beginning with cadC, followed by an IS3 homologous element, part of the newly identified gene rorf0 (13), the LEE core region, an IS2 homologue, and the lifA (efa1) region, and ends with the gene yjdC of the pheU locus. For unknown reasons, it was not possible to sequence the 5′ genetic fragment between the IS3 element and the partial rorf0 gene of cosmid CosB6-202. The sequence of LEE-9812 starts in the pheV region beginning with yghD, followed by genes that are homologous to prophage CP4-44, an IS3 element, parts of the rorf0 gene, and the LEE core region, which is adjacent to a genetic fragment that is homologous to the LEE of bovine STEC (BSTEC) RW1374 and rabbit EPEC (REPEC) 84/110-1 (65) and to at least one DNA fragment that is homologous to the PAIs of O157:H7 STEC, namely OI-43 and OI-48 of EDL933 and SpLE1 of RIMD 0509952 (Sakai). Also in this case, it was not possible to complete the full sequence of LEE-9812, as in several attempts no appropriate cosmid or PCR product could be isolated that contains the 3′-adjacent DNA region of the SpLE1 homologue region. Therefore, we used the sequences of the cosmids Cos9812-003, Cos9812-096, and Cos9812-126 for further analysis (Fig. 2).
Scheme of the sequenced LEE-B6 and LEE-9812. The LEE PAIs of ATEC B6 and ATEC 9812 were sequenced. LEE-B6 is inserted in the pheU locus (cadC, tRNA pheU, and yjdC), whereas LEE-9812 is inserted in the pheV locus (yghD, tRNA pheV). Sequences for the 3′ end of LEE-9812 could not be obtained. Both PAIs are associated with additional genes in their flanking regions such as, e.g., IS3, rorf0 (truncated), IS2, and the lifA (efa1) region for LEE-B6 and for LEE-9812, two regions of CP4-44 prophage genes, IS3, rorf0 (truncated), and a fragment homologous to LEE-RW1374 and LEE-84/110-1, as well as a fragment homologous to PAIs OI-43 and OI-48 of STEC EDL933 and SpLE1 of STEC RIMD 0509952 (Sakai). Cos, cosmid.
Numbering the LEE. The sequences of LEE-B6 (accession no. FM201463) and LEE-9812 (accession no. FM201464) were analyzed by comparison with the LEE sequences of ATEC strains 3431-4/86 (O8:H−) and 0181-6/86 (O119:H9:K61), EPEC E2348/69 (O127:H6), STEC EDL933 (O157:H7), STEC RIMD 0509952 (Sakai) (O157:H7), BSTEC 413/89-1 (O26:nm), BSTEC RW1374 (O103:H2), REPEC RDEC-1 (O15:H−), REPEC 83/39 (O15:H−), and Citrobacter rodentium DBS100. All ORFs, insertion sites, sizes, and G+C contents of all LEE were identified (Table 4). All LEE are integrated in one of the three reported tRNA loci (selC, pheU, or pheV) or, as in the case of the LEE of C. rodentium DBS100, in the ABC transporter cassette located on a plasmid. Concerning the size and the percent G+C content of the LEE, it is obvious that in contrast to the core region of the LEE that is highly conserved (34.4 ± 0.7 kb, 38.4% ± 0.2% G+C), the total inserts harboring the LEE show high variability (between 35.6 kb [E2348/69] and 111.0 kb [RW1374], between a G+C content of 38.4% [E2348/69] and one of 45.0% [RW1374]) (Table 4). The lower G+C content of the LEE core region of about 38.4% in comparison to the G+C content of the total E. coli genome of about 50.8% (10) is consistent with the origin of the LEE from an unknown ancestor and its uptake and distribution by horizontal gene transfer (20, 44). Moreover, the distinct G+C content of the whole LEE inserts, which is influenced by the higher G+C content of the flanking regions, reflects a remodeling of the flanking regions, potentially by recombination events outside the conserved LEE core region subsequent to the original integration of the LEE core.
Integration sites, LEE sizes, GC contents, and intimin types of LEE PAIs of the 12 bacterial strains analyzed in this study
Analysis of the flanking regions of the LEE. Comparative analysis of both newly sequenced LEE of the ATEC strains B6 and 9812 revealed that although their core regions are homologous to the LEE core of the prototype strain EPEC E2348/69, their flanking regions are considerably different and exhibit additional DNA fragments. LEE-B6 is comparable to the LEE of strains RDEC-1, 83/39, and 413/89-1, containing IS sequences at both sides of the core region and the lifA (efa1) region in the 3′-flanking region (Fig. 2 and Fig. 3). The lifA (efa1) region contains the same genes in the same order and orientation as the other strains, including lifA (efa1), ent, nleA, and nleB. In contrast to the other LEE, LEE-B6 contains an additional IS2 element and a separated lifA (efa1) gene: lifA (efa1)-a and lifA (efa1)-b. Like these LEE that are all integrated in the pheU tRNA locus, the LEE-RW1374 (pheV) also contains the lifA (efa1) region, but in this case (an) additional DNA fragment or fragments (∼60 kb) have been inserted (34) (Fig. 3). In addition, EPEC E2348/69, STEC EDL933, and STEC RIMD 0509952 (Sakai) also possess the lifA (efa1) region. However, interestingly, in these cases the lifA (efa1) region appears not to be linked to the LEE, but instead is linked to other PAIs that are integrated in the tRNA pheV locus: PAI OI-122 of EDL933 and SpLE3 of RIMD 0509952 (Sakai). The high homology between these DNA regions, containing lifA (efa1), ent, the integrase gene int, and the genes of IS629, suggests a close relationship and possibly a common origin of these DNA regions. The G+C content of the lifA (efa1) region ranges from 42.9% (83/39) to 44.4% (RW1374) and differs from the G+C content of the E. coli genome (50.8%) and the LEE core region (38.4%), suggesting an additional, third origin. However, as the G+C content varies strongly within the lifA (efa1) region, this genetic region might be assembled from different DNA fragments by frequent recombination events. Particularly the section around the enterotoxin gene ent with a relatively low G+C content of 33.4% supports this assumption.
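The G+C comparisons above rest on a simple calculation that can be sketched as follows: the G+C percentage of a whole region, and a sliding-window profile that exposes local deviations such as the one reported around ent. This is a minimal illustration and not the procedure used with MacVector; the window and step sizes in the usage note, the function names, and the choice to ignore ambiguous bases (e.g., N) are assumptions.

```python
def gc_content(seq: str) -> float:
    """G+C content in percent, counting only unambiguous bases (A, C, G, T)."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


def sliding_gc(seq: str, window: int, step: int) -> list[tuple[int, float]]:
    """G+C content of consecutive windows as (0-based start, percent) pairs.

    Only full-length windows are reported; a trailing partial window is
    dropped.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [
        (start, gc_content(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]
```

For a region of several kilobases, a call such as `sliding_gc(region, window=500, step=100)` (hypothetical parameters) would yield a profile in which a segment of distinct origin appears as a run of windows deviating from the regional average.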
Schematic illustration of LEE organization. Besides the LEE core region (green), most of the analyzed LEE contain additional genetic elements: prophage genes (pink), IS elements (red), the lifA (efa1) region (blue), rorf0, rorf13, a region homologous in ATEC 9812, REPEC 84/110-1 and BSTEC RW1374 (ARB; hatched portion) and a region homologous to PAIs SpLE1 of STEC RIMD 0509952 (Sakai) and OI-43 and OI-48 of STEC EDL933 (orange).
The flanking regions of LEE-9812 are similar to those of the LEE of BSTEC RW1374, as both LEE are integrated in the pheV tRNA gene and exhibit homologous sequences that contain genes of the prophage CP4-44. Differences were found in the distal 3′ regions of these LEE. Here, LEE-9812 is highly homologous to the PAI SpLE1 of STEC RIMD 0509952 (Sakai), which is identical to PAIs OI-43 and OI-48 of STEC EDL933 (Fig. 3).
In contrast to the large LEE-9812 and LEE-RW1374 (110 kb), the LEE of EPEC E2348/69 and ATEC 0181-6/86 are quite small and are basically restricted to the core region (Fig. 3). The LEE of the STEC strains RIMD 0509952 (Sakai) and EDL933 contain 3′ of the core region additional prophage 933L genes, suggesting a possible role in the transfer of the PAI. The LEE of ATEC 3431-4/86 contains two additional ORFs in the flanking regions, whereby rorf0, as mentioned above, is also present in other full-length or truncated LEE integrated in either pheU or pheV (Fig. 3).
Start sites of LEE core genes. The comparative sequence analysis of each gene of the LEE core region showed that the start codons and start positions of some genes differ from the respective elements of the corresponding genes of the EPEC reference strain E2348/69. For example, sequence analysis of rorf1 showed that in ATEC strain 9812 rorf1 exhibits an early stop codon due to a point mutation (C to A) at position 95, leading to a truncated and probably nonfunctional protein. In addition, rorf1 of ATEC strain B6 might also represent a nonfunctional gene, as the potential start codon ATT is located 15 bp downstream of the equivalent rorf1 start codons of other analyzed LEE. In order to examine the transcription of rorf1, total RNAs of strains 9812, B6, E2348/69, 0181-6/86, and 3431-4/86 were isolated, reverse transcribed to cDNA, and analyzed by PCR. In contrast to the positive controls, no PCR amplicons were observed with the template cDNA. This indicates that rorf1 of ATEC strains 9812 and B6 is not transcribed under the conditions employed and that, in addition, the rorf1 genes of E2348/69, 0181-6/86, and 3431-4/86 might also represent hypothetical, nonfunctional genes.
The sequence analysis of orf3 (cesAB) revealed aberrant start sequences in the analyzed strains. As other strains exhibit no possible start codons at the predicted corresponding orf3 starting point of E2348/69 (start 1) and a potential Shine-Dalgarno sequence overlaps this starting point, it is more likely that the sequence of orf3 begins 3 or 6 bases 3′ of start 1 at starting point 2 or 3.
Comparative sequence analysis of genes of the core region of the LEE. Comparative sequence alignments of the core region genes of several LEE PAIs demonstrated that the LEE of nearly all analyzed strains exhibit corresponding genes in identical order and orientation to the reference EPEC strain E2348/69. Minor exceptions were found in the LEE of the mouse-pathogenic strain C. rodentium DBS100, whose genes rorf1 and rorf2 are not located at the 5′ end, but instead at the 3′ end adjacent to the espF gene. Further differences in LEE-DBS100 were found 3′ of orf11, containing additional genes that are homologous to prophage CP4-6, and between eae and escD, harboring a sequence that exhibits homologies to IS285. Further minor differences between other LEE were restricted to the region between rorf2 (espG) and orf1 (ler). At this position, LEE-RW1374 contains an IS1 element, the LEE of E2348/69, 0181-6/86, EDL933, and RIMD 0509952 (Sakai) exhibit the enterobacterial repetitive intergenic consensus (ERIC) sequence, and LEE-DBS100 contains an IS679 element directly 5′ of ler.
Comparative sequence alignments of core sequences. The comparative sequence analysis of each gene and protein of the LEE core revealed that the LEE of 9812 and B6 are highly homologous to each other (on average, 99.5% identity on the DNA level) and to the LEE of REPEC RDEC-1 (98.8%), REPEC 83/39 (98.8%), BSTEC 413/89-1 (99.4%), and BSTEC RW1374 (96.9%). The LEE of EPEC E2348/69, ATEC 0181-6/86, STEC EDL933, and STEC RIMD 0509952 (Sakai) are not as closely related to the former group of LEE listed above, but in turn share considerable identities among each other. As expected, all LEE core regions are highly conserved regarding the order and orientation of the genes, the LEE core sizes (34.4 ± 0.7 kb), and the G+C content (38.4% ± 0.2%). Minor exceptions are observed in the arrangement of genes in the LEE of the mouse pathogen C. rodentium DBS100, where genes rorf1 and rorf2 are located at the 3′ rather than the 5′ end of the core region. Furthermore, the LEE of C. rodentium DBS100 (LEE-DBS100) exhibits additional insertion sequences 3′ of orf11 (grlA) and 3′ of orf22 (eae) that are homologous to the prophage CP4-6 and the IS285 element, respectively.
Functional sequence analysis. Comparative sequence analysis confirmed and extended the high homology of the T3SS constituents and the considerable variability of the effector proteins (data not shown). The highest similarities of all proteins were found among the chaperones CesD (99.4%) and CesT (98.6%) and among the T3SS factors EscR, EscS, EscT, and EscV (98.7% to 98.4%, referenced at the Institute of Infectiology, ZMBE, Münster, Germany). The lowest similarity was identified for EspF (69.4%), which is influenced by the presence of various numbers of proline-rich repeats leading to different sequence lengths (45). sepZ shows only a relatively low sequence identity of 78% on the nucleotide level. This indicates a function as an effector protein rather than as a T3SS factor. This is further underlined by the translocation of the protein into host cells and its accumulation underneath the bacteria (35).
The detailed analysis of the Tir amino acid sequences revealed that overall the sequences of the 12 strains appear more variable. However, the functional domains, especially both transmembrane domains (TM-I and TM-II) and the CesT and intimin binding sites (CB and TIB, respectively), are homologous, reflecting the functionally conserved role of Tir (1, 21, 29). Interestingly, only the STEC strains EDL933 and RIMD 0509952 (Sakai) exhibit valine instead of tyrosine at position 474, while the bovine-pathogenic STEC strains RW1374 and 413/89-1, just like the prototype EPEC strain E2348/69 (4, 37), exhibit a tyrosine residue at this position. In this study, we revealed that all 12 strains examined here exhibit a tyrosine residue (Tyr458) in this homologous region. Therefore, this “alternative” signal transduction pathway, involving TccP (EspFU) as an adaptor protein, might play a more generalized role in the pedestal formation of LEE-positive pathogens.
For EPEC E2348/69, it had been shown that phosphorylation of serine residues Ser434 and Ser463 is necessary for the folding and integration of the Tir protein into the membrane (67). As only the Tir proteins of strains E2348/69, 0181-6/86 and of C. rodentium DBS100 possess serine residues at these positions, these amino acids appear not to play a general role for Tir protein folding. However, further serine residues present in the direct environment might function as potential alternative phosphorylation sites.
DISCUSSION
ATEC strains have been identified not only as the cause of sporadic diarrheal disease but increasingly also in diarrhea outbreaks (5, 36). However, as ATEC strains are found also in healthy volunteers, their pathogenic potential is still under discussion. Therefore, we analyzed the LEE PAIs and selected virulence factors of clinical ATEC isolates in further detail. First, we analyzed all ATEC strains of our strain collection for their genotypic and phenotypic characteristics and found considerable similarities among groups of strains sharing known specific tRNA LEE insertion sites. Nearly all strains in which the LEE is integrated in the pheU tRNA gene induce actin polymerization in HeLa cells, possess resistance to several antibiotics, and harbor the virulence genes lifA (efa1) and ent. This might reflect an increased pathogenic potential inherent in these strains. The strains of the other tRNA insertion groups also induce actin polymerization in infected cells, but only if they are able to adhere to host cells and secrete effector proteins into the supernatant. Interestingly, most of the selC O55 strains examined in this study only rarely induced actin polymerization and pedestal formation under the conditions applied. This is shown by way of example for ATEC strain 10459 (O55:H− selC) (Fig. 1).
Moreover, the pathogenicity of most of the pheU strains might be further enhanced by hemolysin (α-hly or EHEC hly), which is absent in most strains of the other groups. Only a few strains that carry a LEE integrated in the selC or pheV tRNA gene locus and express the O55 or the O127/O128 serotypes are able to induce actin polymerization in host cells, possess antibiotic resistance, or harbor additional virulence factors. Moreover, a different genetic background might also provide additional, yet unknown factors that might contribute to the process of A/E lesion formation and pathogenicity. The finding that only those selC and pheV strains that are able to induce actin polymerization both adhere to host cells and secrete LEE-encoded effector proteins into the supernatant favors this assumption. As the transformation of the LEE-E2348/69 or of LEE-0181-6/86 into the nonpathogenic E. coli K-12 strain enabled these strains to induce A/E lesions (24, 43), any LEE ought to encode the necessary factors for full virulence and, likewise, every LEE-containing strain should be able to induce A/E lesions. Possibly, LEE-positive strains require specific environmental conditions to produce virulence factors and pedestal formation. This is further underlined by the number of putative T3SS proteins identified in the recently sequenced genomes of EPEC strains that ranged from 21 in the E2348/69 genome (33) to 33 in the EPEC strain B171-8 (54) to over 50 in the EHEC O157 strain (33).
ATEC strains are found in a wide range of habitats and hosts. In previous studies, it was shown that animal-pathogenic ATEC strains cause A/E lesions only in those cell lines that were isolated from the appropriate host but not in human-derived HeLa cells (53, 68). Therefore, ATEC strains might be adapted to certain hosts and/or habitats and might not be ubiquitously virulent and/or able to induce pedestal formation under nonpermissive environmental conditions. ATEC strains that do not cause A/E lesions in humans frequently evade the immune system and are able to survive in the host (8). However, such commensal ATEC strains might also represent a potential threat, since new pathogenic variants can arise through the transfer of the LEE to other bacterial strains. Therefore, the sole presence of a LEE, as identified by the presence of marker genes such as the escV gene, might not be sufficient to draw a conclusion about the pathogenic potential of the particular strain without additional knowledge of the regulatory or trigger mechanism(s) that induce sufficient secretion of effector proteins. Consequently, in a LEE-harboring strain, the detection of additional putative virulence factors should be included in the assessment of its pathogenic potential (47).
To analyze the plasticity of ATEC strains in further detail, we sequenced the LEE PAIs of ATEC B6 (O26:K60 pheU) and ATEC 9812 (O128:H2 pheV) and compared them to the LEE of EPEC E2348/69, ATEC 0181-6/86, ATEC 3431-4/86, STEC EDL933, STEC RIMD 0509952 (Sakai), REPEC RDEC-1, REPEC 83/39, BSTEC RW1374, BSTEC 413/89-1, and C. rodentium DBS100.
Our results reflect the requirements of a highly optimized and also strictly regulated secretion system compared to the required adaptation of secreted virulence factors to the respective host cells. Based on these analyses, the LEE investigated in this study can be arranged into two major phylogenetic lineages. This is supported by the general characteristics of the LEE, as the LEE of group E2348/69 are all inserted in selC and exhibit comparable LEE sizes, G+C contents, and intimin types (Table 4). This is in contrast to the LEE of the other group that are all inserted near pheU or pheV, exhibit mainly intimin β, and are often associated with veterinary isolates. The LEE of ATEC 3431-4/86 and of C. rodentium DBS100 exhibit the lowest sequence identity with the other LEE and appear to be more distantly related. The proposed relationship is underlined and illustrated by phylogenetic trees that were generated on the basis of sequence alignments. Both the phylogenetic trees of the strongly conserved genes of the T3SS and the phylogenetic trees of the less-conserved genes, like the ones encoding the effector proteins, are very similar among themselves and reflect the relatedness of the LEE, as underlined in this study. As an example, the phylogenetic tree of the escV gene is shown in Fig. 4. Furthermore, this analysis shows that ATEC strains apparently do not form a separate group, but instead can be arranged in different phylogenetic lineages. Recent evidence suggests that ATEC strains may be more closely related to and might indeed represent EHEC strains that have lost the Shiga toxin-encoding phage (9).
As expected, all LEE core regions are highly conserved regarding the order and orientation of the genes, the LEE core sizes (34.4 ± 0.7 kb), and the G+C content (38.4% ± 0.2%). Minor exceptions are observed in the arrangement of genes in the LEE of the mouse pathogen C. rodentium DBS100, suggesting that LEE-DBS100 might descend from a different ancestor from the LEE of the other strains investigated in this study.
The region with the highest sequence divergence is between rorf2 (espG) and orf1 (ler). At this position, the insertion sequence IS1 is inserted in the LEE of RW1374, and the ERIC sequence is present in the LEE of E2348/69, 0181-6/86, EDL933, and RIMD 0509952 (Sakai). Furthermore, at the analogous position, the LEE of DBS100 contains an IS679 element. It is therefore conceivable that these additional DNA fragments, including the ERIC sequences (32, 63), might influence the transcriptional activity of the surrounding genes, such as the transcription rate of ler and espG (2), and in this way might contribute to the adaptation of the strains.
Comparison of the Tir amino acid sequences showed that the Tir proteins of ATEC B6 and ATEC 9812 possess a tyrosine residue at position 474 (based on the E2348/69 Tir amino acid sequence), just like that of the prototype EPEC strain E2348/69. Therefore, actin polymerization and pedestal formation by ATEC follow the mechanism elucidated for the prototype EPEC E2348/69 via the recruitment of the adapter protein Nck following phosphorylation of Tyr474 (37). Except for the STEC strains, all other analyzed strains—interestingly including the bovine-pathogenic STEC strains RW1374 and 413/89-1—exhibit Tyr474, which points to a similar actin polymerization pathway. Interestingly, all 12 Tir amino acid sequences are highly conserved with respect to tyrosine residue 454 and the neighboring amino acids that were shown to be responsible for the recruitment of the alternative bacterial adapter protein TccP (EspFU) and the induction of the actin polymerization by STEC strains (14). Since this region is highly conserved and TccP (EspFU) (or a homologue) is present in a number of LEE-positive A/E pathogens (23), this “alternative” signal transduction pathway might play an important role in the Tir-mediated actin polymerization.
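Residue positions such as 474 are given relative to the E2348/69 Tir sequence, so comparing strains requires mapping a reference position onto an alignment column. The sketch below illustrates that mapping for a pairwise alignment. The function name `residue_at_reference_position` and the short toy peptides are hypothetical and stand in for real Tir sequences, which are not reproduced here.

```python
def residue_at_reference_position(aligned_ref: str, aligned_query: str, ref_pos: int) -> str:
    """Return the query residue aligned to the 1-based position `ref_pos`
    of the ungapped reference sequence ("-" if the query has a gap there)."""
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    if ref_pos < 1:
        raise ValueError("ref_pos is 1-based and must be >= 1")
    seen = 0  # number of reference residues passed so far
    for ref_char, query_char in zip(aligned_ref, aligned_query):
        if ref_char != "-":
            seen += 1
            if seen == ref_pos:
                return query_char
    raise ValueError("ref_pos exceeds the length of the reference sequence")


# Toy, made-up aligned peptides (hypothetical; not real Tir sequence).
# The reference carries Y at its position 5; the query carries V in that column.
toy_ref = "MK-TAYLQE"
toy_query = "MKSTAVLQE"
print(residue_at_reference_position(toy_ref, toy_query, 5))
```

Counting only non-gap reference characters keeps the numbering tied to the reference even when the query carries insertions, which is why the same homologous residue can have different position numbers in different strains.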
Further sequence analysis revealed that, except for strain 0181-6/86, the serine residues Ser434 and Ser463 of the Tir amino acid sequence of E2348/69 are missing at the analogous positions in the other strains. Therefore, these serine residues that are described to be essential for the folding of Tir and the integration into the host cell membrane (67) do not function in a general mechanism. Furthermore, the contribution of these serine residues to protein folding was not confirmed in in vitro experiments (57).
The comparative sequence analysis of the flanking regions revealed noticeable differences. LEE-2348/69 and LEE-0181-6/86 are mainly restricted to the LEE core region, whereas the other LEE contained additional genetic elements, like the LEE of the STEC strains that contain the prophage genes 933L in the 3′-flanking region. The 5′-flanking regions of 3431-4/86 and RW1374 contain the newly identified gene rorf0 (ibe) (13) that is also present in part in B6, 9812, RDEC-1, 83/39, and 413/89-1. This gives a first hint that these strains might be more closely related to each other. The flanking regions of LEE-B6 resemble the flanking regions of the strains 83/39, RDEC-1, and 413/89-1. These 3′-flanking regions contain the lifA (efa1) region, which possesses the lymphostatin gene lifA (efa1) and the hypothetical enterotoxin gene ent. This lifA (efa1) region is also present in the far 3′-flanking region of LEE-RW1374 and in the pheV region of strains E2348/69, EDL933 (OI-122), and RIMD 0509952 (Sakai) (SpLE3), separated from the LEE. This might be explained by recombination events between the LEE and the lifA (efa1) region-containing PAIs, whereby the lifA (efa1) region was joined to the LEE or vice versa (48). Due to the differences in G+C content of the LEE (38.4%) and the lifA (efa1) regions (∼43%), simultaneous integration of the LEE and the lifA (efa1) region as one genetic element appears rather unlikely.
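The argument from base composition rests on a simple quantity, the percentage of G and C among the bases of a region. A minimal sketch of that calculation is given below. The function name `gc_content`, the choice to ignore ambiguous bases, and the two toy fragments are assumptions for illustration; they are made-up sequences chosen only to show two regions with clearly different composition.

```python
def gc_content(seq: str) -> float:
    """Percent G+C among unambiguous bases (A, C, G, T); other symbols are ignored."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / counted


# Toy, made-up fragments (hypothetical; not real LEE or lifA sequence).
at_rich_fragment = "ATTAGCATTAGCATAATTGC"
balanced_fragment = "ATGCGCATTAGCGCATATGC"
print(gc_content(at_rich_fragment), gc_content(balanced_fragment))
```

A consistent composition difference between adjacent regions is commonly read as a hint that they were acquired separately, which is the reasoning applied to the LEE and the lifA (efa1) region above.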
In summary, these analyses demonstrated that the LEE PAI undergoes frequent alterations, whereby the core regions are relatively stable and the flanking regions are characterized by considerable differences. Previous studies showed that the evolution of PAIs is accelerated by various genetic events, such as recombination between insertion sequences, direct repeats, or other homologous sequences (28, 30, 61). These events might also be involved in the evolution of the LEE, as the frequently found IS elements and prophage genes of the analyzed LEE point to a high recombination frequency. Conspicuously, prophages of the P4 family, e.g., the CP4-44 genes of LEE-9812 and LEE-RW1374 or the 933L prophage genes of LEE-EDL933 and LEE-Sakai, are often associated with certain LEE. Since phages frequently integrate their DNA into tRNA genes and LEE are mostly located in tRNA genes, bacteriophages of the P4 type might play a crucial role in LEE translocation and evolution. Apparently, P4 phages prefer particular tRNA genes as insertion sites, as hitherto LEE have been found in selC, pheU, and pheV loci. Further PAIs containing P4-like elements, such as OI-43, OI-48, and SpLE1, as well as the Shigella resistance loci (SRL) of Shigella (41), are located in serine tRNA genes. This makes recombination events between these PAIs, including the LEE, very likely and might explain the association of the LEE with the lifA (efa1) and the 933L regions.
Due to the different insertion sites in certain E. coli strains of different serotypes, it was assumed that the LEE was possibly taken up several times from an unknown origin during evolution (6, 58, 60). However, the unexpectedly high level of identity (on average, 99.5% identity at the DNA level) between the LEE core regions of B6 (O26:K60) and 9812 (O128:H2), which are integrated in different tRNA loci, clearly points to LEE transfer between different strains, which had also been suggested previously (e.g., references 28, 31, 50, 60, and 65). The high homology of the LEE core regions of the 11 analyzed E. coli strains points to a common “LEE parentage pool,” whereby the LEE were either integrated directly into different insertion sites or by horizontal gene transfer and subsequent remodeling. This can be exemplified by a hypothetical phylogenetic tree generated by ClustalW analysis which is based on the sequence of the escV genes of the 12 LEE examined in this study (Fig. 4). The LEE of the C. rodentium mouse-pathogenic strain DBS100 probably descended from another “LEE parentage pool,” as it is quite different from the E. coli LEE. It is therefore tempting to propose a (hypothetical) model of possible linkages between the analyzed strains (Fig. 5) that corresponds nicely to the hypothetical phylogenetic tree based on escV sequences depicted in Fig. 4.
Hypothetical phylogenetic tree of LEE pathogens based on the sequence analysis of escV. On the basis of comparative sequence analysis of the LEE gene escV, a phylogenetic tree of LEE pathogens was generated using ClustalW of MacVector. Branch lengths approximate the number of substitutions per site.
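Branch lengths expressed as substitutions per site are usually derived from pairwise distances between aligned sequences. As a hedged illustration only, the sketch below computes the observed proportion of differing sites (p-distance) and applies the Jukes-Cantor correction for multiple substitutions. The Jukes-Cantor model is swapped in here because it is simple; which distance correction, if any, was applied by the ClustalW implementation in MacVector for the tree above is not stated and is not assumed. The function names and toy sequences are hypothetical.

```python
import math


def p_distance(aligned_a: str, aligned_b: str) -> float:
    """Proportion of differing sites, ignoring columns that contain a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(aligned_a.upper(), aligned_b.upper())
             if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return sum(1 for x, y in pairs if x != y) / len(pairs)


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor distance d = -(3/4) * ln(1 - 4p/3), in substitutions per site."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Toy, made-up aligned fragments with 2 differences in 20 sites (hypothetical).
toy_x = "ATGCATGCATGCATGCATGC"
toy_y = "ATGAATGCATGCATGCTTGC"
p = p_distance(toy_x, toy_y)
print(p, jukes_cantor(p))
```

For closely related sequences the corrected distance is only slightly larger than the p-distance, so for highly conserved genes such as escV the two measures would be expected to give very similar branch lengths.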
Hypothetical model of evolutionary descent of LEE. Possibly, all analyzed LEE emerged from a common hypothetical LEE parental pool and developed differently afterwards. The LEE of C. rodentium DBS100 exhibits the largest difference compared to the other PAIs and appeared to have separated very early. Alternatively, LEE-DBS100 might also be derived from another “ancestral LEE” (as indicated by the dashed line). Since the LEE of group 1 do not contain rorf0, but do contain ERIC sequences, these strains very likely represent a separate phylogenetic lineage. EPEC E2348/69 and ATEC 0181-6/86 harbor—in contrast to the STEC EDL933 and STEC RIMD 0509952 (Sakai) strains—no 933L prophage. All other analyzed LEE contain in their 5′-flanking region a complete or partial sequence of rorf0 and are therefore more closely related to each other. However, a direct descent of another ancestral LEE cannot be excluded (dashed lines). The LEE core regions of group 2 are nearly identical to each other, making a very close relationship very likely. Possibly, the original ancestral LEE was transferred through horizontal gene transfer (HGT) to the parental strain of BSTEC RW1374. Thereupon, a recombination (rec.) event between the LEE-RW1374 and a genomic island, similar to STEC SpLE1, took place and LEE-9812 emerged. Possibly, the core region of LEE-9812 was transferred through horizontal gene transfer by IS elements directly to ATEC B6 into the lifA (efa1) region.
The comparative sequence analysis of the 12 LEE revealed that the four ATEC strains do not form a separate phylogenetic group but, owing to the remarkable correspondence and homology, can be assigned together with certain other strains to defined phylogenetic groups. The classification was based on the intimin type and the comparative analyses of the flanking regions of the LEE, insertion sites, and LEE core regions. Therefore, the LEE of the strains EPEC E2348/69, ATEC 0181-6/86, STEC EDL933, and STEC RIMD 0509952 (Sakai) and, respectively, the LEE of strains ATEC B6, ATEC 9812, REPEC RDEC-1, REPEC 83/39, BSTEC 413/89-1, and BSTEC RW1374 should be arranged into two phylogenetic groups. The identities between the LEE pairs of the strains EPEC E2348/69 and ATEC 0181-6/86, STEC EDL933 and STEC RIMD 0509952 (Sakai), ATEC B6 and ATEC 9812, and REPEC RDEC-1 and REPEC 83/39 are particularly remarkable. The LEE of strains ATEC 3431-4/86 and C. rodentium DBS100 obviously represent separate lineages.
ACKNOWLEDGMENTS
This study was supported by grants from the EU Network ERA-NET PathoGenoMics (no. PTJ-BIO/0313937C), from the Competence Network PathoGenoMik (PTJ-BIO/03U213BVBIIIPG3), and from the German Research Foundation (DFG, SFB293 B5).
We are indebted to A. Fruth and H. Tschäpe (RKI Wernigerode) for serotyping of the ATEC strains investigated in this study, to F. Ebel (Munich) for the gift of antiserum (18), and to C. Beinke for selC primers.
FOOTNOTES
- Received 23 January 2009.
- Returned for modification 8 March 2009.
- Accepted 26 May 2009.
- Published ahead of print on 8 June 2009.
Editor: J. B. Bliska
- American Society for Microbiology