Infection and Immunity, December 2007, p. 5859-5866, Vol. 75, No. 12
0019-9567/07/$08.00+0 doi:10.1128/IAI.00709-07
Copyright © 2007, American Society for Microbiology. All Rights Reserved.
Šmajs,1,2*
Petra Matějková,1,2
Erica Sodergren,2
Anita G. Amin,2
Jerrilyn K. Howell,3
Steven J. Norris,3 and
George M. Weinstock2
Department of Biology, Faculty of Medicine, Masaryk University, Kamenice 5, Building A6, 625 00 Brno, Czech Republic,1 Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza, Alkek N1619, Houston, Texas 77030,2 Department of Pathology and Laboratory Medicine, University of Texas-Houston Medical School, 6431 Fannin Street, Houston, Texas 77030,3
Received 25 May 2007/ Returned for modification 20 July 2007/ Accepted 8 September 2007
The T. pallidum subspecies and T. paraluiscuniculi cannot be distinguished by morphology, protein content, or physiology (12, 17), suggesting that they are closely related. Serum from rabbits infected with T. paraluiscuniculi cross-reacted with 21 of 22 proteins recognized by rabbit antibodies raised against T. pallidum subsp. pallidum (1). However, in rabbits (which are susceptible to both T. pallidum subsp. pallidum and T. paraluiscuniculi infection) there is no immunological cross-protection against these species (23, 25). In addition to a lack of cross-immunity, these bacterial species differ in their host specificity and the clinical manifestations of the diseases that they cause. Human syphilis is a sexually transmitted disease characterized by infection of a wide spectrum of tissues and organs, multiple stages, persistent infection for years to decades, and various clinical manifestations (18), whereas rabbit venereal spirochetosis is characterized by genital lesions (17). These findings suggest that there are important differences between the two species in terms of antigens and virulence factor expression. Genetic differences between T. pallidum subsp. pallidum and T. paraluiscuniculi must account for the observed differences in immunity, host specificity, and clinical manifestations.
Neither T. pallidum subsp. pallidum nor T. paraluiscuniculi has been cultured continuously in vitro, and this fact prevents the use of common molecular genetic approaches to study these pathogens. Sequencing and in silico analysis of the T. pallidum subsp. pallidum Nichols genome (8, 26) allowed comparison of these genomes by use of comparative genomics methods.
In this study we compared the genomes of T. pallidum subsp. pallidum Nichols and T. paraluiscuniculi Cuniculi A using DNA microarray hybridization, whole-genome fingerprinting (WGF), and sequencing of chromosomal regions.
DNA labeling, microarray hybridization, and data analysis.
Preparations of T. pallidum subsp. pallidum and T. paraluiscuniculi chromosomal DNA (0.25 to 0.75 µg) were labeled fluorescently using the Klenow enzyme (New England Biolabs, Ipswich, MA) and random nonamers with a CyScribe First-Strand cDNA labeling kit (Amersham Pharmacia Biotech, Piscataway, NJ) according to the protocol described previously (24). Microarrays containing PCR products representing the 1,039 T. pallidum subsp. pallidum Nichols open reading frames (ORFs) were prepared as described by Šmajs et al. (24). The pretreated slides (24) were hybridized simultaneously with labeled DNA using the CyScribe First-Strand cDNA labeling kit (Amersham Pharmacia Biotech). Quantitation of hybridization, exclusion of outliers, and data normalization were performed using the TIGR Spotfinder and TIGR MIDAS software (21). Combining the results of four independent experiments, including dye swapping in two separate hybridizations, yielded 12 possible values for each gene. From these data points, average signal ratios (ASR) and standard deviations were calculated. From these data, a set of genes with mean ASR of labeled T. paraluiscuniculi Cuniculi A DNA to labeled T. pallidum subsp. pallidum Nichols DNA (ASRCuniculi A/Nichols) less than 0.7 (average log2 ratio less than –0.51) was derived. This set comprised 22 genes that are likely to contain deletions and/or major sequence changes. No genes with a mean ASR greater than 1.43 (average log2 ratio greater than 0.51) were identified.
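The ASR filtering step described above can be illustrated with a minimal Python sketch. It is not the TIGR Spotfinder/MIDAS workflow used in the study; it only assumes that each gene already has a list of normalized Cuniculi A/Nichols signal ratios. The function names, the gene labels, and the example ratios are hypothetical and serve only to show how the 0.7 and 1.43 cutoffs would be applied.

```python
import math
from statistics import mean, stdev

# Cutoffs quoted in the text: ASR below 0.7 or above 1.43.
LOW_CUTOFF = 0.7
HIGH_CUTOFF = 1.43


def summarize_gene(ratios):
    """Return (ASR, standard deviation, log2 of the ASR) for one gene.

    `ratios` holds the normalized Cuniculi A/Nichols ratios that survived
    outlier exclusion (up to 12 values per gene in the study design).
    """
    asr = mean(ratios)
    sd = stdev(ratios) if len(ratios) > 1 else 0.0
    return asr, sd, math.log2(asr)


def classify(asr):
    """Label a gene by comparing its ASR with the two cutoffs."""
    if asr < LOW_CUTOFF:
        return "lower"    # candidate deletion or major sequence change
    if asr > HIGH_CUTOFF:
        return "higher"
    return "similar"


# Hypothetical ratios for two made-up genes; these are NOT data from the study.
example = {
    "geneA": [0.21, 0.25, 0.19, 0.23],
    "geneB": [0.98, 1.05, 1.01, 0.96],
}

for gene, ratios in example.items():
    asr, sd, log_ratio = summarize_gene(ratios)
    print(gene, round(asr, 2), round(sd, 2), round(log_ratio, 2), classify(asr))
```

In this sketch the log2 value is taken from the mean ratio; averaging the log2 ratios of the individual spots, as the text phrases it, would give a slightly different number for noisy data.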
WGF.
WGF was performed as described previously (27). The chromosomal DNA was amplified in 97 overlapping regions with a median length of 12,307 bp (range, 1,778 to 24,758 bp) using a GeneAmp XL PCR kit (Applied Biosystems, Foster City, CA). The primer pairs used for these amplifications are shown in Table S1 in the supplemental material. Each PCR product was digested with BamHI, EcoRI, or HindIII or combinations of these enzymes. To thoroughly assess the possible presence of deletions and insertions in restriction fragments, additional digestions were performed as needed to reduce the length of each restriction fragment to ≤4 kb. This was achieved by additional digestion with AccI, ClaI, EcoRV, KpnI, MluI, NcoI, NheI, RsrII, SacI, SpeI, XbaI, or XhoI (NEB) or combinations of these enzymes. The resulting fingerprints for T. pallidum subsp. pallidum Nichols were compared to those for the T. paraluiscuniculi Cuniculi A genome.
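The principle behind comparing restriction fingerprints can be sketched in silico. The Python example below is an illustration only, not the laboratory procedure: it assumes a linear PCR product given as a plain sequence string, uses the standard recognition sites of BamHI (G^GATCC), EcoRI (G^AATTC), and HindIII (A^AGCTT), and compares exact fragment lengths, whereas agarose gels resolve length differences only approximately. The function names and the toy sequences are hypothetical.

```python
import re

# Recognition sequence and cut offset (number of bases of the site that stay
# on the left-hand fragment) for the three primary enzymes named in the text.
ENZYMES = {
    "BamHI": ("GGATCC", 1),
    "EcoRI": ("GAATTC", 1),
    "HindIII": ("AAGCTT", 1),
}


def cut_positions(seq, enzymes):
    """Return sorted cut coordinates in a linear sequence for the given enzymes."""
    seq = seq.upper()
    cuts = set()
    for name in enzymes:
        site, offset = ENZYMES[name]
        # A lookahead pattern reports every occurrence, including overlapping ones.
        for match in re.finditer(f"(?={site})", seq):
            cuts.add(match.start() + offset)
    return sorted(cuts)


def fragment_lengths(seq, enzymes):
    """Return fragment lengths (longest first) after a complete digestion."""
    bounds = [0] + cut_positions(seq, enzymes) + [len(seq)]
    return sorted((b - a for a, b in zip(bounds, bounds[1:])), reverse=True)


def fingerprint_differs(seq_a, seq_b, enzymes):
    """True when two amplicons give different fragment-length patterns."""
    return fragment_lengths(seq_a, enzymes) != fragment_lengths(seq_b, enzymes)


# Toy amplicons (hypothetical): the second carries a 3-nt deletion between
# the BamHI and EcoRI sites.
reference = "A" * 10 + "GGATCC" + "T" * 20 + "GAATTC" + "C" * 5
variant = "A" * 10 + "GGATCC" + "T" * 17 + "GAATTC" + "C" * 5

print(fragment_lengths(reference, ["BamHI", "EcoRI"]))
print(fragment_lengths(variant, ["BamHI", "EcoRI"]))
print(fingerprint_differs(reference, variant, ["BamHI", "EcoRI"]))
```

An indel changes the length of the fragment that contains it, and a nucleotide change inside a recognition site removes or creates a cut, which merges or splits fragments. Both kinds of change alter the pattern, which is why shorter fragments make small indels easier to detect.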
PCR amplification and DNA sequencing. Standard methods were used for PCR amplification from a chromosomal DNA template and agarose gel electrophoresis (22). For sequencing of PCR products, XL PCR was used to minimize the number of PCR errors. Oligonucleotide primers were designed with Primer3 software (20). The resulting PCR products were purified using a QIAquick PCR purification kit (QIAGEN) and were sequenced using a Taq DyeDeoxy terminator cycle sequencing kit (Applied Biosystems). Complete sequences of amplified regions were finished using specifically designed synthetic oligonucleotides as primers. Computer-assisted sequence analysis was performed using the LASERGENE program package (DNASTAR, Madison, WI). Three XL PCR products comprising regions TPI12, TPI25A, and TPI25B (see Table S1 in the supplemental material) were purified and subjected to mechanical shearing to obtain smaller fragments (0.5 to 1 kb) that were cloned into the pUC18 vector. The resulting recombinant plasmids (96 plasmids for each XL PCR product) of the small insert library were isolated and sequenced using forward and reverse pUC18 primers to obtain multiple coverage (i.e., 2 x 96 sequencing reactions per XL PCR product).
Nucleotide sequence accession numbers. The nucleotide sequences reported in this study have been deposited in the GenBank database under accession numbers EF057750, EF137736 to EF137743, and EF419245 to EF419253.
6 of the 12 reactions for 11 ORFs (TP0161, TP0224, TP0490, TP0573, TP0645, TP0753, TP0777, TP0795, TP0818, TP0932, and TP1032), and therefore analysis of these ORFs was not performed. All of these ORFs are relatively short (93, 105, 189, 93, 177, 285, 225, 159, 153, 93, and 432 bp, respectively) and code for a conserved hypothetical protein (TP0490) or hypothetical proteins (TP0161, TP0224, TP0573, TP0645, TP0753, TP0777, TP0795, TP0818, TP0932, and TP1032). Thus, data were calculated for 1,028 of 1,039 genes (99%) by determining the ASRCuniculi A/Nichols representing the average, normalized ratio of T. paraluiscuniculi Cuniculi A DNA fluorescent signals to T. pallidum subsp. pallidum Nichols DNA fluorescent signals for replicate spots on each microarray and in replicate experiments. A value of 1.0 corresponded to the mean signal for all genes of the array. The results of the DNA microarray hybridizations are shown in Table 1. Use of the Cuniculi A probe yielded significantly lower signals for 22 genes, with ASRCuniculi A/Nichols ranging from 0.14 to 0.7 (Table 1). These genes were not randomly distributed throughout the genome (Fig. 1) and tended to be clustered in regions containing tpr genes and genes in the vicinity of tpr genes. All but four putative genes (TP0128, TP0129, TP0896, and TP0970) belonged to paralogous gene family 2 (PGF2), PGF14, and PGF15. PGF2 represents tpr genes encoding Tpr proteins, which are T. pallidum-specific proteins of unknown function with similarity to the Treponema denticola membrane protein Msp (7). Eight tpr genes (tprC, tprD, tprF, tprG, tprI, tprJ, tprK and tprL) had significantly lower ASRCuniculi A/Nichols values, indicating that there were deletions or sequence diversity in the T. paraluiscuniculi Cuniculi A genes. 
In contrast, signals for the tprA, tprB, tprE, and tprH genes were similar in the two genomes, indicating that these genes are present in the Cuniculi A genome and that the Cuniculi A genes are highly homologous to their Nichols counterparts. Genes belonging to PGF14 and PGF15 encode hypothetical proteins with unknown functions encoded by genes in the vicinity of tpr genes. The ASRCuniculi A/Nichols values for PGF14 and PGF15 are considerably lower (range, 0.14 to 0.56) than the ASRCuniculi A/Nichols values for tpr genes (0.49 to 0.68). The average lengths of the tpr genes and the PGF14 and PGF15 genes listed in Table 1 were 1,779 bp and 658 bp, respectively, so the PCR products used in the microarray were longer for the tpr genes. Therefore, this approach may be less sensitive to detection of sequence changes when larger genes are examined.
TABLE 1. DNA microarray-predicted deletions and sequence changes in the T. paraluiscuniculi Cuniculi A genome: T. paraluiscuniculi ORFs with the lowest ASRCuniculi A/Nichols
FIG. 1. Schematic representation of genes detected by lower-microarray-signal (DNA microarray) analysis, indels detected by WGF, and genes selected for sequencing in the T. paraluiscuniculi Cuniculi A genome. Deletions and sequentially diverse genes are indicated by vertical bars, and insertions are indicated by vertical lines ending with triangles. Open arrows indicate chromosomal regions that were sequenced in the Cuniculi A genome.
TABLE 2. Comparison of the T. pallidum subsp. pallidum Nichols and T. paraluiscuniculi Cuniculi A genomes using WGF and subsequent DNA sequencing of identified diverse regions: prominent insertions and deletions
TABLE 3. Comparison of the T. pallidum subsp. pallidum Nichols and T. paraluiscuniculi Cuniculi A genomes using WGF and subsequent DNA sequencing of identified diverse regions: major sequence changes (multiple SNPs) and frameshifts
Hypothetical proteins were characterized by searching the InterPro and Pfam databases and constructing hydrophobicity plots. Of the 25 hypothetical genes described in this study (Tables 1 to 3), 17 were completely sequenced in both the Cuniculi A and Nichols strains. The corresponding 17 hypothetical proteins were analyzed to predict cellular localization. Signal sequences were predicted for six Nichols proteins (encoded by TP0133, TP0134, TP0135, TP0136, TP0548, and TP0733) and five Cuniculi A proteins (encoded by TP0315, TP0470, TP0548, fused genes TP0617 and TP0618, and TP0733). In both strains, transmembrane regions were predicted for TP0733. Three putative protein domains (UPF0164, TPR, and DbpA) were found in five hypothetical proteins (encoded by TP0470, TP0548, TP0860, TP0865, and TP1029). No differences between the Nichols and Cuniculi A strains were found in domain distribution.
DNA sequencing of Cuniculi A chromosomal regions. To determine the level of sequence identity between the Nichols and Cuniculi A genomes, three chromosomal regions that also included intergenic regions (IGR) were sequenced. Analysis of these regions (5,289 bp; 0.46% of the genome), comprising genes TP0798 to TP0800 (accession number EF419251), TP0933 and TP0934 (accession number EF419252), and TP0961 and TP0962 (accession number EF419253), revealed 37 SNPs that resulted in 11 amino acid substitutions in the corresponding proteins. The average density of SNPs represented one nucleotide change per 143 bp (99.3% identity).
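The figures in this paragraph follow from simple arithmetic, reproduced in the short Python sketch below. The SNP count and the sequenced length come from the text; the genome length of 1,138,006 bp is an assumption taken from the published Nichols genome sequence rather than from this section, and the function name is hypothetical.

```python
# Assumed length of the T. pallidum subsp. pallidum Nichols genome (bp);
# this value is not stated in the section and is used only for the percentage.
NICHOLS_GENOME_BP = 1_138_006


def snp_summary(snps, sequenced_bp, genome_bp=NICHOLS_GENOME_BP):
    """Return (bp per SNP, percent identity, percent of genome sequenced)."""
    bp_per_snp = sequenced_bp / snps
    identity_pct = 100 * (1 - snps / sequenced_bp)
    genome_pct = 100 * sequenced_bp / genome_bp
    return bp_per_snp, identity_pct, genome_pct


# Values reported in the text: 37 SNPs in 5,289 bp.
bp_per_snp, identity_pct, genome_pct = snp_summary(snps=37, sequenced_bp=5289)
print(round(bp_per_snp), round(identity_pct, 1), round(genome_pct, 2))
```

With the reported values this gives about one change per 143 bp, 99.3% identity, and 0.46% of the genome, matching the numbers quoted above.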
Compared to the WGF results, the DNA microarray approach identified five additional genes (TP0117, TP0462, TP0896, TP0897, and TP0970) with lower hybridization signals. Two of these genes belonged to PGF2, one belonged to PGF15, and two were unique (TP0896 and TP0970). Sequence diversity of these genes was identified as the reason for the lower hybridization signals on the DNA microarray. In these genes, the sequence diversity was dispersed throughout the entire genes and thus had the potential to affect hybridization to a DNA microarray (P. Matějková, unpublished results). DNA microarray and WGF approaches thus represent complementary methods; DNA microarray analysis allows selective detection of diverse chromosomal regions, and WGF allows selective identification of insertions within the genes and indels in intergenic regions.
Several of the observed indels and sequence changes were identified in the family of tpr genes (in 8 of 12 tpr genes). The T. pallidum repeat (tpr) genes encode paralogous proteins with sequence similarity to the major outer sheath protein (Msp) of T. denticola (7). The tpr genes are specific for T. pallidum and T. paraluiscuniculi, and several of them show heterogeneity both within and between the T. pallidum subspecies and strains examined (3, 4, 5). It is believed that the Tpr proteins are involved in pathogenesis and/or immune evasion. The TprK protein was found to induce a strong humoral and cellular immune response (3, 14, 15), and variable regions of TprK are responsible for the specificity of the antibody response (16). Moreover, sequences of variable regions of TprK change during infection and passage of T. pallidum subsp. pallidum strains (6) by a gene conversion mechanism with donor sites in the vicinity of tpr genes (e.g., in TP0137 and in TP0126 to TP0130). Thus, some of the observed genetic differences in the tprK locus of the T. paraluiscuniculi genome may also be due to this gene conversion mechanism. In addition, three new ORFs in the T. paraluiscuniculi genome with tprK-like sequences were identified.
With the exception of tpr genes, the TP0104 gene (5' nucleotidase), and the TP0545 gene (periplasmic galactose-binding protein), all other detected indels or sequence changes were localized in the genes encoding a conserved hypothetical protein (TP0470) or hypothetical proteins. The average transcription rate of these genes in T. pallidum subsp. pallidum cultivated in rabbit testes is considerably higher (1.74) than the average transcription rate of all genes of T. pallidum subsp. pallidum Nichols (1.0) (24). In addition, 8 of 29 (27.6%) of the proteins encoded by these genes were found to be recognized by serum antibodies derived from rabbits 84 days after infection with the Nichols strain (13). Both of these findings indicate that several of the putative genes identified are transcribed and translated and suggest that these T. pallidum subsp. pallidum genes are important during infection of rabbits. Most of these genes (17 of 29) were localized in the vicinity of tpr genes. Insertions identified in the T. paraluiscuniculi genome indicated that the sequences were tprK-like, tprA or tprB sequences, or unique sequences with no homologous sequences identified by the BLAST search. Deletion of the signal sequence peptide in MglB-1 encoded by TP0545 in the Cuniculi A strain may result in aborted export of this protein to the periplasm.
Seventeen hypothetical proteins were analyzed to predict cellular localization. Signal sequences were predicted in six and five proteins encoded in the Nichols and Cuniculi A genomes, respectively. Except for two hypothetical proteins (TP0548 and TP0733), signal sequences were predicted for different Nichols and Cuniculi A proteins. Possible localization of these proteins outside the cytoplasm may contribute to the different host ranges and pathogenicities of the Nichols and Cuniculi A strains.
A portion of the TPI12 region of the T. paraluiscuniculi genome sequenced in this study was nearly identical to a previously sequenced 2,792-nt region (accession number AY685237) comprising a nonfunctional tprD2 gene (9). Differences in 9 nt were found. Other regions of near identity with previously sequenced regions (9) were found in TPI2 and the accession number AY685232 sequence (tprA, 1,003 nt), in TPI2 and the accession number AY685233 sequence (tprB, 838 nt), in TPI25A and the accession number AY685239 sequence (nonfunctional tprG1, 3,255 nt), in TPI25B and the accession number AY685238 sequence (nonfunctional tprG and tprI, 2,449 nt), in TPI48 and the accession number AY685240 sequence (nonfunctional tprG2, 3,018 nt), and in TPI77 and the accession number AY685235 sequence (tprL, 1,331 nt). Within these regions, two, zero, seven, four, nine, and three nucleotide differences were found, respectively. These results could reflect differences accumulated in the Cuniculi A genome during independent cultivation in different laboratories; they potentially could also be due to PCR errors. It was previously shown that in the tprK locus (TP0897), sequence changes occurred during infection and passage of T. pallidum subsp. pallidum strain Chicago (6).
Altogether, 639 target restriction sites (representing 3.8 kb of the genomic sequence or 0.34% of the Nichols genome) in the Cuniculi A genome were analyzed with three enzymes (BamHI, HindIII, and EcoRI). Assuming that the majority of additional or missing restriction target sites were due to single nucleotide changes, the sequence similarity of the Cuniculi A and Nichols genomes could be predicted to be 98.6%. Sequencing of three chromosomal regions representing 0.46% of the Cuniculi A genome revealed a sequence identity of 99.3%. However, the latter result is a rather high estimate of the sequence identity because the value could be distorted by a number of factors, including nonrandom distribution of sequenced DNA and the fact that the sequentially divergent regions in the Cuniculi A strain appear to be localized in certain chromosome regions.
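The coverage figures for the restriction site analysis can be checked with the following Python sketch. It assumes 6-nt recognition sites (true for BamHI, HindIII, and EcoRI) and a Nichols genome length of 1,138,006 bp, a value taken from the published genome sequence and not stated in this section. The number of changed sites is not given here, so the count of 54 used below is back-calculated from the quoted 98.6% and is not a reported result; the function names are hypothetical.

```python
SITE_LENGTH = 6                  # BamHI, HindIII, and EcoRI recognize 6-nt sites
NICHOLS_GENOME_BP = 1_138_006    # assumed genome length (bp), not from this section


def site_coverage(n_sites, genome_bp=NICHOLS_GENOME_BP):
    """Return (nucleotides sampled by the sites, percent of the genome)."""
    sampled_bp = n_sites * SITE_LENGTH
    return sampled_bp, 100 * sampled_bp / genome_bp


def identity_from_sites(n_sites, n_changed_sites):
    """Percent identity if every changed site reflects one nucleotide change."""
    sampled_bp = n_sites * SITE_LENGTH
    return 100 * (1 - n_changed_sites / sampled_bp)


sampled_bp, genome_pct = site_coverage(639)
print(sampled_bp, round(genome_pct, 2))

# Back-calculation (an inference, not a reported count): how many single
# nucleotide changes would correspond to 98.6% identity over the sampled sites.
implied_changes = round((1 - 0.986) * sampled_bp)
print(implied_changes, round(identity_from_sites(639, implied_changes), 1))
```

Under these assumptions the 639 sites sample 3,834 nt, or about 0.34% of the genome, in agreement with the text.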
The data presented indicate that the genomes of T. pallidum subsp. pallidum and T. paraluiscuniculi are very closely related and that most of the observed differences are localized in tpr loci and in the vicinity of these loci, suggesting their possible role in the host range and pathogenicity of T. pallidum subsp. pallidum. The high degree of sequence similarity of the genomes tested could be used for planning an optimal genome sequencing strategy. In further studies, the high level of relatedness of the T. pallidum subsp. pallidum, T. pallidum subsp. pertenue, and T. paraluiscuniculi genomes could be used for identifying and deciphering T. pallidum subsp. pallidum virulence determinants.
This work was supported by Public Health Service grants to G.M.W. (grants R01 DE12488 and R01 DE13759) and S.J.N. (grants R01 AI49252 and R03 AI69107) and by grants 310/04/0021 and 310/07/0321 from the Grant Agency of the Czech Republic, grant NR8967-4/2006 from the Ministry of Health of the Czech Republic, and grant VZ MSM0021622415 from the Ministry of Education of the Czech Republic to D.S.
Published ahead of print on 24 September 2007.
Supplemental material for this article may be found at http://iai.asm.org/.
Šmajs, D., M. McKevitt, J. K. Howell, S. J. Norris, W. W. Cai, T. Palzkill, and G. M. Weinstock. 2005. Transcriptome of Treponema pallidum: gene expression profile during experimental rabbit infection. J. Bacteriol. 187:1866-1874.
Šmajs. 2000. Identification of virulence genes in silico: infectious disease genomics, p. 251-261. In K. A. Brogden, J. A. Roth, T. B. Stanton, C. A. Bolin, F. C. Minion, and M. J. Wannemuehler (ed.), Virulence mechanisms of bacterial pathogens, 3rd ed. ASM Press, Washington, DC.