Infection and Immunity, January 2003, p. 187-195, Vol. 71, No. 1
0019-9567/03/$08.00+0 DOI: 10.1128/IAI.71.1.187-195.2003
Copyright © 2003, American Society for Microbiology. All Rights Reserved.
Department of Diagnostic Medicine/Pathobiology, College of Veterinary Medicine, Kansas State University, Manhattan, Kansas 66506,¹ and Centers for Disease Control and Prevention, Atlanta, Georgia 30333²
Received 4 September 2002/ Returned for modification 8 October 2002/ Accepted 16 October 2002
A multigene locus that encodes a 28-kDa outer membrane protein(s) (OMP) has been reported from E. chaffeensis, E. canis, and E. ruminantium (15, 18-21, 24-26, 34). The protein-coding sequences of the genes have four long stretches of conserved regions separated by three highly variable regions (VRs) where the dominant immunogenic B-cell epitopes are located (25). This locus has generated considerable interest for its possible role in immune evasion because the 28-kDa OMP locus shares structural similarity to antigenic variant surface antigen genes of Borrelia burgdorferi and Neisseria gonorrhoeae (12, 28, 36).
The protein-coding region of the gene encoding a 120-kDa OMP contains two to four nearly identical, highly hydrophilic 80-amino-acid tandem repeats (30, 35). The number of repeats varies among different isolates, resulting in the size variations in the encoded protein. Similarly, within the coding region of the variable-length PCR target (VLPT) gene there is a variable number of direct nucleotide repeats that may code for various numbers of 30-amino-acid repeats (23, 31). The presence of variable direct repeats in E. chaffeensis is similar to that of the major antigenic variant surface protein of Mycoplasma hominis (37). M. hominis surface protein, termed variable adherence-associated antigen, contains one to four nearly identical repeats of 121 amino acids, and the gain or loss of repeats gives rise to distinct antigenic variants with size variations in variable adherence-associated antigen in clonal populations (37).
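The size arithmetic implied by these tandem repeats can be sketched as follows. This is an illustrative calculation only, not part of the reported methods. It assumes that each amino acid in a repeat is encoded by one 3-nucleotide codon and that repeats are gained or lost as whole units; because the repeats are described as nearly identical, the unit sizes are approximations. All function names are hypothetical.

```python
# Illustrative sketch: nucleotide length contributed by in-frame tandem repeats.
# Repeat lengths (in amino acids) come from the text; names are hypothetical.

CODON_LENGTH_BP = 3

REPEAT_LENGTH_AA = {
    "120-kDa OMP": 80,  # 80-amino-acid tandem repeats
    "VLPT": 30,         # 30-amino-acid repeats
}


def repeat_unit_bp(gene):
    """Length in base pairs of one repeat unit of the given gene."""
    return REPEAT_LENGTH_AA[gene] * CODON_LENGTH_BP


def repeat_region_bp(gene, n_repeats):
    """Total length in base pairs of n_repeats tandem copies."""
    return repeat_unit_bp(gene) * n_repeats


def size_difference_bp(gene, n_a, n_b):
    """Expected length difference between alleles with n_a and n_b repeats."""
    return abs(n_a - n_b) * repeat_unit_bp(gene)


if __name__ == "__main__":
    for gene in REPEAT_LENGTH_AA:
        print(gene, "repeat unit:", repeat_unit_bp(gene), "bp")
    print("120-kDa OMP, 2 vs 4 repeats:",
          size_difference_bp("120-kDa OMP", 2, 4), "bp")
```

Under these assumptions, one repeat unit corresponds to 240 bp in the 120-kDa OMP gene and 90 bp in the VLPT gene, so isolates with two and four 120-kDa repeats would differ by 480 bp in the coding region.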
In this study we mapped E. chaffeensis isolates to examine variability in the genome. Specifically, the 28-kDa gene locus spanning 53 kb of DNA from 10 human isolates was characterized at the molecular level. We also compared the sequence data generated from 15 kb of the 120-kDa OMP gene and 4 kb of the VLPT gene from all 10 isolates.
TABLE 1. E. chaffeensis isolates
An approximately 7-kb fragment spanning genes 14 through 19 of the Arkansas isolate generated by PCR (described in the next paragraph) was used as the probe for mapping the 28-kDa locus. Similarly, amplicons generated from the Arkansas isolate for the 120-kDa and VLPT genes by using primers described in the next paragraph were used to make hybridization probes. A 0.39-kb rRNA gene segment amplified from the Arkansas isolate as described previously (11) was used to map the rRNA gene locus. Hybridization was performed overnight at 68°C. The blots were washed once each for 30 min at 68°C with 6x SSC (1x SSC is 0.15 M NaCl plus 0.015 M sodium citrate) containing 0.1% SDS, 2x SSC containing 0.1% SDS, 1x SSC containing 0.1% SDS, and 0.2x SSC containing 0.1% SDS, in that order. After the final stringent wash, membranes were exposed for 2 days to X-ray film at -70°C with an intensifying screen, and the film was developed in a Kodak film processor.
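The salt concentrations in the wash series follow directly from the stated definition of 1x SSC. The short sketch below works them out; it is an illustration of the dilution arithmetic only, and the function names are hypothetical.

```python
# Illustrative sketch: molar composition of each SSC wash in the series.
# The 1x SSC definition and the wash strengths are taken from the text.

SSC_1X = {"NaCl": 0.15, "sodium citrate": 0.015}  # mol/L at 1x

# Wash series, from lowest to highest stringency (decreasing salt).
WASH_SERIES = [6.0, 2.0, 1.0, 0.2]


def ssc_composition(fold):
    """Return molar concentrations of each salt at the given fold strength."""
    return {salt: round(conc * fold, 4) for salt, conc in SSC_1X.items()}


def wash_table(series=WASH_SERIES):
    """Return (fold, composition) pairs for every wash in the series."""
    return [(fold, ssc_composition(fold)) for fold in series]


if __name__ == "__main__":
    for fold, comp in wash_table():
        print(f"{fold:g}x SSC: {comp['NaCl']:.3f} M NaCl, "
              f"{comp['sodium citrate']:.4f} M sodium citrate")
```

The series therefore runs from 0.9 M NaCl in the first wash down to 0.03 M NaCl in the final wash, with 0.1% SDS present throughout.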
Gene analysis by PCR and nucleotide sequence.
A 28-kDa OMP locus-specific primer pair was designed on the basis of the published sequence for the Arkansas isolate (RRG14, 5'CCGTTTTCTGCTTTATTAGAATG; and RRG18, 5'RNCTAATAATTACAATGTGTG). The primer pair was used to amplify approximately 7-kb fragments spanning genes 14 through 19 (34). PCR was performed by using a long PCR reagent kit (Gibco BRL, Rockville, Md.). The high-fidelity DNA polymerase included in the PCR kit minimizes errors in amplified products. The PCR cycles, performed in a GeneAmp 9700 (Applied Biosystems, Foster City, Calif.), included one initial cycle at 94°C for 4 min followed by 35 cycles of 94°C for 30 s, 55°C for 30 s, and 68°C for 7 min and one extension cycle at 72°C for 15 min. Ten percent of each PCR product was resolved on a 0.9% agarose gel and detected by ethidium bromide staining. The primer pair for amplifying the 120-kDa gene was F1 (5'GAGAATTGATTGTGGAGTTGG), reported by Yu et al. (35), and RRG24 (5'CTATCTCAAGACTAAACCTTAC). RRG24 is a downstream, reverse primer designed from the sequence reported by Yu et al. (35). VLPT gene-specific PCR primers FB5 (5'AAATAGGGTATAAATATGTCAC) and FB3 (5'GCCTAATTCAGATAAACTAAC) were used as described earlier (31).
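As a worked example of the cycling profile given for the 28-kDa locus amplification, the sketch below totals the programmed hold times and expresses the extension step per kilobase of product. It is illustrative only. It ignores ramp times between temperatures, which depend on the instrument, and it treats the amplicon as exactly 7 kb, which is an approximation.

```python
# Illustrative sketch: programmed hold time for the cycling profile in the text.
# Ramp times between temperatures are instrument dependent and are ignored.

INITIAL_DENATURATION_S = 4 * 60   # 94°C for 4 min
CYCLE_STEPS_S = [30, 30, 7 * 60]  # 94°C for 30 s, 55°C for 30 s, 68°C for 7 min
N_CYCLES = 35
FINAL_EXTENSION_S = 15 * 60       # 72°C for 15 min


def total_hold_seconds(initial_s=INITIAL_DENATURATION_S,
                       cycle_steps_s=CYCLE_STEPS_S,
                       n_cycles=N_CYCLES,
                       final_s=FINAL_EXTENSION_S):
    """Sum of all programmed holds, in seconds."""
    return initial_s + n_cycles * sum(cycle_steps_s) + final_s


def extension_seconds_per_kb(extension_s, amplicon_kb):
    """Extension time allotted per kilobase of expected product."""
    return extension_s / amplicon_kb


if __name__ == "__main__":
    total = total_hold_seconds()
    print(f"Total programmed hold time: {total} s ({total / 60:g} min)")
    print(f"Extension allowance: {extension_seconds_per_kb(7 * 60, 7):g} s per kb")
```

With these values the programmed holds add up to 17,940 s (299 min), and the 7-min extension step allows 60 s per kb for a 7-kb product.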
PCR products of 28-kDa, 120-kDa OMP, and VLPT gene segments were purified by use of a PCR product presequencing kit (USB Corp., Cleveland, Ohio) and were used for sequence determination by primer walking with the manual Thermo Sequenase Radiolabeled Terminator Cycle Sequencing kit (USB Corp.). Sequence analysis was performed by using the GCG program package (7) available on the Kansas State University Unix computer system. The sequence data were deposited in the GenBank database.
Detection of RNA from the 28-kDa OMP gene locus by RT-PCR. Gene-specific primers, designed from the VRs within each gene for genes 14 to 19, were used in reverse transcription (RT)-PCR analysis (Table 2). Total RNA was extracted from E. chaffeensis cultures by use of the RNAwiz total RNA isolation kit (Ambion Inc., Austin, Tex.). RNA samples were stored at -70°C until use. Total RNA was treated with RQ1 DNase (Promega Corp., Madison, Wis.) to eliminate genomic DNA prior to use in RT-PCR assays. To increase the activity, the DNase treatment was performed for 1 h at 37°C in buffer provided by the vendor. In addition, 1 mM CaCl2 and 1.5 mM MgSO4 were added. The presence of gene-specific RT-PCR products was verified after transferring the products to a nylon membrane followed by hybridization with gene-specific probes.
TABLE 2. Primers used for RT-PCR analysis of 28-kDa OMP genes
Nucleotide sequence accession numbers. Sequences reported in this paper were deposited in the GenBank database under numbers AF479833 to AF479840, AF474890 to AF474899, AF470688 to AF470697, AY117396, and AY117397.
FIG. 1. DNA filter hybridization of E. chaffeensis. Ten isolates from HME patients were used: Osceola, lane 1; Arkansas, lane 2; Lithonia, lane 3; Chattanooga, lane 4; West Paces, lane 5; Heartland, lane 6; St. Vincent, lane 7; Wakulla, lane 8; Liberty, lane 9; Jax, lane 10. DNA from each isolate was digested to completion, resolved on a 0.9% agarose gel, transferred to a nylon membrane, and hybridized with 32P-labeled probes specific to the 28-kDa gene locus (A); 120-kDa gene (B); VLPT gene (C); or ribosomal RNA gene (D). Restriction enzymes used for the DNA blot analysis are identified on the top of each blot. Molecular size markers are present on the left side of the profiles.
Approximately 7-kb-long DNA segments spanning six genes of the 28-kDa OMP locus (genes 14 to 19 of E. chaffeensis Arkansas isolate) were amplified from DNA of all 10 E. chaffeensis isolates (Fig. 2). Predicted-size PCR products were detected in Group I and III isolates, while the amplicons for the Group II isolates, including the Wakulla isolate, were about 1.2 kb smaller. The reduction in the size of Group II PCR fragments was approximately equal to the loss of one gene. The entire protein-coding sequence of the 120-kDa gene and partial segments of the VLPT gene were also amplified. Unlike the DNA blot analysis, amplification of these two gene segments showed notable size differences (Fig. 2).
FIG. 2. PCR analysis of three gene loci of E. chaffeensis. (A) The 28-kDa OMP loci spanning genes 14 through 19 were amplified, and the products were resolved on a 0.9% agarose gel and detected after being stained with ethidium bromide. Lanes 1 to 10 contained PCR products generated from Osceola, Arkansas, Lithonia, Chattanooga, Wakulla, West Paces, Heartland, St. Vincent, Liberty, and Jax isolates, respectively. Panels B and C are the same as panel A except that the PCR primers used for amplification were specific to the 120-kDa gene and VLPT gene segments, respectively.
FIG. 3. Comparison of nucleotide sequences of the 28-kDa gene amplicons of all three genetic group isolates. Sequences for genes 14 to 19 of the amplified 28-kDa OMP locus segments were generated by Sanger's manual dideoxy chain termination method with synthetic overlapping primers. The entire gene locus spanning all 22 genes present in the genome for the Arkansas isolate is presented at the top of the figure for comparison. Arrowheads indicate the orientation of the protein coding sequence. Protein coding sequences are identified with black boxes, and variable regions within the coding regions are highlighted by gray boxes.
FIG. 4. Phylogenetic tree of the 28-kDa OMP locus spanning orthologous and paralogous genes 14 to 19. Protein-coding sequences that encode 28-kDa OMPs from all genes depicted in Fig. 3 were translated and used to construct the phylogenetic tree. Sequences from one isolate each (Arkansas, Jax, and St. Vincent) were used to prepare the tree with the GCG programs Pileup, Distance, and Growtree (7). An OMP, p44-13, a sequence of Anaplasma phagocytophila (GenBank accession no. AF414592), was used as an outgroup to root the tree. Orthologous genes 15, 15a, and 17 clustered on the same branch of the tree, while genes 14, 16, 18, and 19 grouped on separate branches. Symbols I, II, and III in the figure represent genes from isolates of Groups I, II, and III, respectively.
To verify the organization of the mapped region, specifically the position of each gene in the locus and the loss or gain of genes among isolates, PCR analysis of several overlapping fragments was performed (Fig. 5). PCR primers were designed from the variable regions within the protein-coding sequences of the 28-kDa gene locus (Table 2), and the sizes of the expected amplicons were estimated (Fig. 5A). The amplicons generated with these primer sets (Fig. 5B) were of the expected sizes, thus verifying the position of each gene on the mapped locus.
FIG. 5. Verification of the mapped 28-kDa locus from Group II and III isolates. To verify the organization of genes, PCR primers were designed to amplify predicted size overlapping fragments. (A) The primers were designed from the variable regions of coding sequences, and the sizes of the predicted amplicons were estimated. Segments A and B verify the location of genes 14 and 15 for Group II isolates. Segments C and G verify the loss of gene 18 in Group II and III isolates. The gain of gene 15a in Group III isolates was verified by amplifying segments D, E, and F. (B) Amplification data for two isolates each (Group II, Wakulla and St. Vincent; Group III, Liberty and Jax) are presented. The predicted amplicons were detected in the PCR products.
FIG. 6. Schematic representation of the nucleotide sequence data generated from the 120-kDa OMP and VLPT gene loci. Lines represent unique sequences at both 5' and 3' ends of the amplified segments. Boxes represent tandem repeats. The repeat sizes in the 120-kDa and VLPT genes are 240 and 90 bp, respectively. The number of repeats varies among isolates, as represented by the varying numbers of boxes.
FIG. 7. RT-PCR analysis of E. chaffeensis isolate Arkansas RNA extracted from in vitro-cultured organisms. Purified RNA was digested with RQ1 DNase for 1 h to remove any residual genomic DNA. DNA-free RNA was used for the RT-PCR assay by using gene-specific primer sets for genes 14 to 19. The primers were designed from variable regions and are listed in Table 2. Lane N contained all reagents for the RT-PCR assay, except RNA. Lane D is the same as lane N plus genomic DNA. Lane R is the same as lane N but with DNase-treated RNA as the template. Lane C is the same as lane R except that the reverse transcriptase is omitted. (A) Amplified products resolved on a 1.2% agarose gel and detected after being stained with ethidium bromide. (B) RT-PCR products transferred to a nylon membrane and hybridized with a 32P-labeled 28-kDa gene probe. St. Vincent, Wakulla, and Liberty isolates were similarly analyzed.
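The four lanes described for each gene (N, D, R, and C) lend themselves to a simple decision rule for calling a transcript present. The sketch below is one plausible reading of how such controls are conventionally interpreted; it is an assumption made for illustration, not the authors' stated scoring procedure, and the function name and result strings are hypothetical.

```python
# Illustrative sketch: interpreting RT-PCR control lanes for one gene.
# Each argument is True if a product of the expected size was seen in that lane.
# The decision rule is an assumed convention, not taken from the paper.

def call_transcript(lane_n, lane_d, lane_r, lane_c):
    """Classify one gene from its four lanes.

    lane_n: all reagents, no RNA (reagent contamination control)
    lane_d: genomic DNA template (primer positive control)
    lane_r: DNase-treated RNA template with reverse transcriptase
    lane_c: DNase-treated RNA template without reverse transcriptase
    """
    if lane_n:
        return "invalid: product in no-template control"
    if not lane_d:
        return "invalid: primers failed on genomic DNA"
    if lane_c:
        return "invalid: residual genomic DNA in RNA sample"
    return "transcript detected" if lane_r else "transcript not detected"


if __name__ == "__main__":
    print(call_transcript(lane_n=False, lane_d=True, lane_r=True, lane_c=False))
    print(call_transcript(lane_n=False, lane_d=True, lane_r=False, lane_c=False))
```

The ordering of the checks reflects the idea that a result in lane R is only interpretable once all three controls behave as expected.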
TABLE 3. Expression of the 28-kDa OMP analyzed by RT-PCR
FIG. 8. Immunoblot analysis. Purified recombinant proteins for genes 16 and 19 of the Arkansas isolate, a recombinant antigen of a homologous gene from E. canis (ORF1) (24), and a whole-cell E. chaffeensis Arkansas isolate protein fraction were tested by Western blot analysis by using serum obtained from a B6 mouse immunized with live E. chaffeensis Arkansas isolate. (A) Coomassie blue stained gel; (B) corresponding immunoblot. Lanes: M, protein markers; E.ca, E. canis recombinant antigen; 16 and 19, the recombinant antigens made from E. chaffeensis isolate Arkansas 28-kDa OMP genes 16 and 19, respectively; E.ch, whole-cell antigen of E. chaffeensis isolate Arkansas.
We and other researchers have reported that the protein-coding sequences of the genes in the 28-kDa OMP locus are divided into three highly variable regions that are separated by four highly conserved regions (15, 18-21, 24-26, 34). The variable regions constitute hydrophilic regions and may represent B-cell epitopes (25). In this study, we observed a similar gene structure having conserved and variable regions in the protein-coding sequences of paralogous genes of all 10 isolates. Variation among the paralogous genes within each isolate was higher than that found among orthologous genes from different isolates. These observations suggest that the paralogous genes in the genome may have been generated by gene duplication but may since have evolved independently of each other.
The loss or gain of two genes within the 28-kDa OMP locus is a novel observation that, to our knowledge, we report here for the first time. This finding supports the hypothesis that the locus undergoes major rearrangements, such as gene duplication or elimination, resulting in the loss or gain of a complete gene(s). The presence of gene 18 only in Group I isolates and gene 15a only in Group III isolates is evidence for this type of genetic change. Size variations in the 120-kDa and VLPT genes among different isolates also suggest that the insertion or deletion of large pieces of DNA is a common occurrence in this bacterium. Variation in the number of repeats in these two loci may also have resulted from slippage mutations. The molecular basis and functional significance of the observed genomic changes resulting from gene deletions or insertions and the variable number of repeats within protein-coding sequences remain to be investigated.
Our RT-PCR analysis of total RNA isolated from cultured E. chaffeensis isolates revealed differences in the transcriptional activity of the locus. Expressed transcripts remained constant for two older isolates (Arkansas and St. Vincent) (4, 23) but varied with the source of RNA for two relatively new isolates (Wakulla and Liberty) (31). To examine whether mRNA expression coincides with protein synthesis, we also analyzed antigen expression by Western blot analysis. For the two genes tested in this study, the Western blot results agreed fully with the transcription data.
RT-PCR analyses of the paralogs of the 28-kDa genes of E. chaffeensis were also reported recently for the Arkansas isolate (15, 34). A locus from E. canis that is highly homologous to the E. chaffeensis 28-kDa OMP locus was also examined for gene expression by RT-PCR (19). According to Long et al. (15), RT-PCR analysis of RNA obtained from in vitro-cultured E. chaffeensis suggests the expression of 16 of the 22 genes. Ohashi et al. (19) reported the expression of all 22 genes from the E. canis locus. Our study on the transcriptional analysis of RNA spanning seven 3'-end clusters of OMP genes of in vitro-cultured E. canis (referred to as an region [19]) revealed the expression of only three of seven genes (Ganta and Cheng, unpublished results). Only one OMP gene transcript of E. canis grown in a tick host is reported, and a higher level of expression of the same gene transcript is also reported for E. canis cultured at 25°C (32). Our present study analyzed RNA for the six tandemly arranged, paralogous genes ( region of the locus [19]) of a Group I isolate, Arkansas, Group II isolates St. Vincent and Wakulla, and a Group III isolate, Liberty. Our data were identical to those reported by Long et al. (15) for genes 15, 17, 18, and 19 of the Arkansas isolate. While we observed expression of gene 14 and no expression of gene 16, Long et al. (15) reported the opposite. Similarly, our unpublished RT-PCR data for E. canis differed from those reported by Ohashi et al. (19). Our data in the present study for the Wakulla and Liberty isolates varied among different preparations of RNA. Despite the presence of 22 genes in the locus, and independent of the variations in the identified expressed genes reported by different laboratories, it is noteworthy that multiple genes are transcriptionally active. It is not clear why the expressed transcripts vary. There may be multiple reasons for the observed differences in the transcribed genes. Two possible reasons are (i) RNA isolated from cultured organisms may represent RNA derived from bacteria in a nonsynchronized state, i.e., obtained from a mix of different life cycle stages, and (ii) expression may be influenced by the culture conditions. The variability in gene expression at the transcript level raises questions about using RT-PCR data as a measure for examining and interpreting data for translation of the gene products.
Interestingly, despite differences in the RNA expression and the documentation of multiple transcripts, the detectable protein made from the 28-kDa OMP locus reported in the literature remained constant (15, 21). On the basis of the N-terminal amino acid sequence of expressed proteins from cultured E. chaffeensis, Ohashi et al. (21) and Long et al. (15) identified only one protein, i.e., the product of gene 19. Our present study also supports the expression of this protein. These observations suggest that while the data on the transcriptional activity are conflicting, there is consistency in the translated protein encoded from the 28-kDa OMP locus. More importantly, there is no evidence that supports the translation of more than one gene.
Long et al. (15) suggested that several genes may be transcribed but that not all genes are translated. An alternative explanation for the observed presence of multiple transcripts made from the 28-kDa locus is that they represent transcripts made for overlapping genes that may be present within the 28-kDa OMP gene locus. In overlapping genes, parts of the same DNA region can encode more than one protein by using different reading frames (27). Sharing the same genomic region to encode two or more proteins is commonly reported for viruses, bacteria, and protozoans (1, 13, 29). To examine whether overlapping genes may exist in E. chaffeensis, predicted amino acid sequences having open reading frames longer than 20 amino acids in the remaining two forward frames and the three reverse frames were identified for genes 14 to 19 of the Arkansas isolate 28-kDa locus and were subjected to BLAST homology search. Predicted amino acid sequences of several ORFs exhibited significant similarities (>50% identity) with known sequences. A few examples are listed in Fig. 9. These observations raise the possibility of the presence of overlapping genes in the 28-kDa OMP locus. The presence of transcripts as judged by positive RT-PCR products for several nontranslated 28-kDa genes, therefore, may represent transcripts derived from the possible overlapping genes located within the 28-kDa OMP locus.
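The frame scan described above can be sketched in a few lines. The code below is a minimal illustration, not the GCG-based analysis used in the study. It assumes the standard codon assignments, defines an ORF as any stop-free stretch of codons (with no requirement for a start codon), and treats the annotated gene as lying in forward frame 1; all function names are hypothetical, and the BLAST step is not shown.

```python
# Illustrative sketch: find long stop-free stretches in the five reading
# frames other than the annotated one. ORF definition and names are assumptions.
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons; unknown codons become 'X', stops become '*'."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def orfs_in_other_frames(seq, annotated_frame=1, min_aa=21):
    """Return (frame, peptide) pairs for stop-free stretches of >= min_aa.

    Frames are numbered +1, +2, +3 on the given strand and -1, -2, -3 on the
    reverse complement. min_aa=21 corresponds to "longer than 20 amino acids".
    """
    seq = seq.upper()
    strands = {1: seq, -1: reverse_complement(seq)}
    hits = []
    for sign, strand in strands.items():
        for offset in range(3):
            frame = sign * (offset + 1)
            if frame == annotated_frame:
                continue
            for peptide in translate(strand[offset:]).split("*"):
                if len(peptide) >= min_aa:
                    hits.append((frame, peptide))
    return hits


if __name__ == "__main__":
    demo = "ATG" + "GCT" * 25 + "TAA"  # synthetic sequence, not real data
    for frame, peptide in orfs_in_other_frames(demo):
        print(frame, len(peptide), peptide)
```

Each peptide returned by such a scan would then be submitted to a database similarity search, which is the step that yielded the examples in Fig. 9.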
FIG. 9. GenBank search analysis of long ORFs other than 28-kDa OMP coding sequences. Predicted amino acid sequences of ORFs longer than 20 amino acids from forward frames 2 and 3 and the three reverse frames were subjected to a homology search with sequences in the databases. ORFs having significant sequence similarities (>50% identity) to sequence motifs of known sequences available in the GenBank database are identified. One example each from genes 16, 17, 18, and 19 is shown.
At this time, it is unclear why E. chaffeensis has several tandemly arranged genes having only one expressed antigen. The significance of the variable number of repeats in 120-kDa and VLPT genes among isolates is also unclear. The presence of a multigene locus and the variable multirepeat sequences in E. chaffeensis are similar to some outer surface protein gene loci that have been shown to play important roles in immune evasion in the genomes of the pathogenic bacteria B. burgdorferi, N. gonorrhoeae, and M. hominis (12, 28, 36, 37). The functional significance of these gene loci in E. chaffeensis, therefore, requires the analysis of organisms recovered over time from a reservoir host that is infected by tick bite.
In conclusion, we identified three distinct genetic groups of E. chaffeensis as judged by the analysis of the 28-kDa OMP locus. Novel gene deletion or insertion mutations resulting in the loss or gain of genes were identified in this locus that allowed separation of isolates into three groups. The isolates within each group may represent separate E. chaffeensis strains. Isolates from each group were associated with severe or fatal disease, and so it appears that multiple pathogenic strains of this bacterium exist. There were also other genomic differences that resulted from the loss or gain of long repeats within the coding sequences of 120-kDa and VLPT genes. These differences did not overlap with the genetic grouping of isolates established from the analysis of the 28-kDa OMP locus. The molecular heterogeneity in E. chaffeensis may have developed as a mechanism to evade the host immune response, or it could be the result of independently evolved genes. Genomic differences in the E. chaffeensis isolates may influence disease pathogenesis, a hypothesis that remains to be tested.
This paper is published as Kansas Agricultural Experiment Station Contribution number 02-497-5.