Infection and Immunity, November 2006, p. 6429-6437, Vol. 74, No. 11
0019-9567/06/$08.00+0 doi:10.1128/IAI.00809-06
Copyright © 2006, American Society for Microbiology. All Rights Reserved.
Department of Infectious Diseases and Pathology,1 Department of Physiological Sciences, University of Florida, Gainesville, Florida 32611,2 Norwegian School of Veterinary Science, Sandnes N-4325, Norway,3 Department of Clinical Microbiology, Kalmar County Hospital, Kalmar, Sweden,4 Department of Wildlife, Humboldt State University, Arcata, California 95521,5 Department of Medicine and Epidemiology, School of Veterinary Medicine, University of California, Davis, California 956166
Received 19 May 2006/ Returned for modification 26 June 2006/ Accepted 30 August 2006
Although A. phagocytophilum has been less well studied, there are some similarities between the two organisms. A. phagocytophilum also expresses a major surface protein that is an ortholog of MSP2, which we will term here MSP2 (P44) (25, 33, 40). Like MSP2 of A. marginale, MSP2 (P44) contains a central hypervariable region and is expressed from a genomic expression site subject to frequent gene conversion by sequence duplicated from msp2 (p44) copies elsewhere in the genome (4, 27). The promoter regions in the two expression sites have nearly identical sequences (2). There are also differences between the two systems. In the A. phagocytophilum genome, there are many more msp2 (p44) pseudogene copies, about 100 compared to about 7 in A. marginale (6, 14). Also, in the variants analyzed to date, there appears to be less use of small donor gene segments to create mosaic complexity in A. phagocytophilum than that in A. marginale. In fact, the expression site hypervariable regions have closely resembled those in the donor copies, except at the borders of the converted segments (4, 26, 27). However, as yet, relatively few expression site variants of A. phagocytophilum have been analyzed, and these have all been derived from human strains. The purpose of this study was to investigate the structure and diversity of the msp2 (p44) expression site in a greater range of animal species in order to ascertain whether similar persistence mechanisms could operate in the full range of infections caused by A. phagocytophilum.
Determination of the structure of the msp2 (p44) expression site. The sequence of the msp2 (p44) expression site in U.S. strains from humans has been defined previously (4). Initial attempts to amplify the msp2 (p44) gene in the expression site using the oligonucleotide primers AB 1000 (CCGGCTGAAGTGAGGAGACGA) and AB 1001 (AAGTACCGCAGGAAGTAGAAT), defined previously, were successful for all U.S. strains but unsuccessful with the European samples. We then used different combinations of primers and also a primer-walking strategy (DNA walking SpeedUp kit; Seegene USA, Rockville, MD) to obtain provisional sequences for the expression sites from the European strain. These sequences differed from the U.S. strain sequence, particularly in regions immediately flanking the msp2 (p44) gene in the expression site, but did contain these conserved features: an upstream gene similar to p44ESup1, a nearly identical 5' promoter, and a 3'-truncated recA gene (Fig. 1). To confirm and correct this provisional sequence, PCR amplification with oligonucleotide primers AB 1207 (GGGAGTGCTCTGGTTAGATTTAGG) and AB 1198 (GCGAGGAAGCAATGAGAATAG) was used to amplify a 3.1-kb fragment containing p44ESup1, msp2 (p44), and truncated recA. This fragment was gel isolated, cloned in the plasmid vector pCR4-TOPO (Invitrogen, Carlsbad, CA), and sequenced on both strands. The promoter region, which has proven unstable and difficult to clone in previous studies (2), was PCR amplified separately with primers AB 1041 (ATGTCAGTACCGGCATATCTTGAAATC) and AB 1200 (GCATAGAACCCATCGGCTTCAC) to yield an overlapping 331-bp fragment that was sequenced directly without cloning.
FIG. 1. Msp2 expression site variability in three U.S. and two European strains of A. phagocytophilum. PLOTSIMILARITY comparison of U.S., Norwegian, and Swedish strains (top panel) or three U.S. strains (bottom panel). The positions of coding sequences for truncated RecA, MSP2 (P44), P44ESup1, and the promoter (P) are shown above the PLOTSIMILARITY profiles. A similarity score of 1.0 indicates identical sequence in a sliding window of 10 nucleotides, and a decreasing score from 1.0 to 0 indicates increasing variation. Sequences of U.S. strains NY-18, NY-37, and HZ are from GenBank AY164490, AY137510, and CP000235, respectively.
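The sliding-window similarity profile described in this caption can be illustrated with a minimal sketch. This is not the PLOTSIMILARITY implementation: the per-column score used here (fraction of identical sequence pairs) and the function names `column_identity` and `sliding_similarity` are assumptions chosen for illustration, and the input is assumed to be a set of pre-aligned, equal-length sequences.

```python
from itertools import combinations


def column_identity(column):
    """Fraction of sequence pairs that carry the same character in one column.

    Assumption: a simple identity score; the GCG program uses its own scoring.
    """
    pairs = list(combinations(column, 2))
    if not pairs:
        return 1.0  # a single sequence is trivially identical to itself
    return sum(1 for x, y in pairs if x == y) / len(pairs)


def sliding_similarity(aligned, window=10):
    """Average the per-column identity over a sliding window of `window` columns.

    `aligned` is a list of pre-aligned sequences of equal length.
    Returns one score per window position; 1.0 means every column in the
    window is identical across all sequences.
    """
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    per_column = [column_identity(col) for col in zip(*aligned)]
    if length < window:
        return []
    return [
        sum(per_column[i:i + window]) / window
        for i in range(length - window + 1)
    ]
```

A score that drops well below 1.0 over a stretch of windows would correspond to a variable region such as the one flanking msp2 (p44) in the top panel.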
2 kb (Fig. 1). In all cases, PCR-amplified fragments were cloned into the plasmid vector pCR4-TOPO and individual colonies were grown overnight in 96-well deep blocks containing 1.5 ml LB medium and kanamycin (50 µg/ml). Plasmid DNA was prepared from cultures by centrifuging blocks at 1,100 x g for 15 min, resuspending cells in 400 µl of 50 mM Tris-HCl, pH 8.0, 10 mM EDTA, 100 µg/ml RNase A, and then adding 400 µl of 200 mM NaOH, 1% sodium dodecyl sulfate and inverting five times to lyse cells. Four hundred microliters of 3 M sodium acetate, pH 4.5, was added, and the plates were covered with sealing tape and inverted five times. The blocks were placed at −80°C for 1 to 2 h and then thawed and centrifuged for 30 min at 2,830 x g at 4°C. Nine hundred microliters of cleared supernatant was transferred to a 96-well filter plate (Unifilter; Whatman, Clifton, NJ) on a new 96-well block. An equal volume of isopropanol was added to each well, and the block was covered with sealing tape and inverted 1 to 2 times before being centrifuged for 45 to 60 min at 2,830 x g. The supernatant was carefully decanted, and pellets of DNA were washed with 1 ml 70% ethanol and air dried before resuspension in TE buffer (10 mM Tris-HCl, 1 mM EDTA, pH 8.0). All samples were digested with EcoRI to release insert DNA and with EcoRI and RsaI to analyze restriction fragment length polymorphism (RFLP) patterns. Digested DNA was resolved by electrophoresis on 1.5% agarose gels and visualized by staining with GelStar nucleic acid stain (Cambrex, Rockland, Maine). Clones were selected for DNA sequencing based on the different RFLP patterns obtained: representative clones were taken from each of the major RFLP patterns, excluding singlet patterns.
The analyses described here included RFLP analysis from expression site clones (469 infected human, 342 HGE2 culture, 311 NY-18 culture, 36 Webster culture, 218 infected wood rat, 20 infected bear, 229 infected Swedish dog, and 67 infected Norwegian sheep) of A. phagocytophilum. Sequencing was performed at the University of Florida DNA Sequencing Core Laboratory (Gainesville, FL) by using ABI Prism dye terminator cycle sequencing protocols developed by Applied Biosystems (Foster City, California). The fluorescently labeled extension products were sequenced on both strands by using PerkinElmer/Applied Biosystems automated DNA sequencers. A minority of the sequences obtained contained small changes in amino acid sequence, such as a single amino acid substitution, or an apparent deletion that resulted in no open reading frame through the msp2 (p44) hypervariable region. For the purpose of the analyses described here, we did not attempt to determine whether these changes resulted from differences in the original template DNA or were introduced as a result of PCR amplification, clone instability, or sequencing errors that were reproduced on both strands. However, an arbitrary decision was made to include in the alignments all variants with minor amino acid substitutions, so that identities with genome copies would not be missed. Likewise, we excluded from the analysis any expression site sequence without a complete open reading frame through the hypervariable region. Such sequences have been observed previously in the syntenic A. marginale expression site (5), however, and could be a natural result of frequent pseudogene recombination into the msp2 (p44) expression site.
Sequence analyses and alignments. Nucleotide sequences were analyzed using the Wisconsin Package, version 10.3-Unix (Accelrys, Inc., San Diego, CA), running under SunOS 5.8, available through the Biological Computing Core facilities of the Interdisciplinary Center for Biotechnology Research at the University of Florida. Sequence alignments were made initially using PILEUP, and similarities were displayed using PLOTSIMILARITY and PRETTYBOX. Ka:Ks ratios (nonsynonymous substitutions per nonsynonymous site to synonymous substitutions per synonymous site) were determined with DIVERGE. To align multiple sequences from the sequenced HZ strain genome with expression site sequences (see Fig. S1 in the supplemental material), the annotated msp2 (p44) genomic loci in GenBank CP000235 were first extracted as a multiple Fasta file using ARTEMIS (38) (http://www.sanger.ac.uk/Software/Artemis/) and aligned with DIALIGN (32), running on a Relion 130 Linux server (Penguin Computing, San Francisco, CA). This effectively identified the presence of the conserved regions flanking the hypervariable region in genomic open reading frames of very different lengths. The genome and expression site sequences were then trimmed to the LAKT residues present on both sides of the hypervariable regions and aligned with PILEUP. The aligned multiple sequence file was visualized with CHROMA (23) (http://www.llew.org.uk/chroma). To determine the relationships between the aligned sequences in terms of percent amino acid sequence identities, a Fasta file containing all trimmed genome and expression site sequences was analyzed with MatGAT (Matrix global alignment tool) (10). The percentage identities determined by MatGAT for an all-against-all comparison were exported to an Excel spreadsheet for final determination of those alignments meeting or exceeding cutoff values of 70 or 90% identity. The final alignments analyzed (see Fig. S1 in the supplemental material) contained 94 A. phagocytophilum genomic loci of msp2 (p44), 74 expression site variants of U.S. origin (4 bear, 19 wood rat, 25 cultured human origin, and 26 human), 19 Swedish dog variants, and 23 Norwegian sheep variants.
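The trimming and cutoff steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the MatGAT procedure: MatGAT performs its own pairwise global alignments, whereas this sketch assumes the sequences are already aligned to equal length (with `-` for gaps). The function names `trim_to_flanks`, `percent_identity`, and `pairs_at_or_above` are hypothetical.

```python
from itertools import combinations

FLANK = "LAKT"  # conserved residues on both sides of the hypervariable region


def trim_to_flanks(seq, flank=FLANK):
    """Return the span from the first flank motif through the last one.

    Returns None when two non-overlapping copies of the motif are not found.
    """
    start = seq.find(flank)
    end = seq.rfind(flank)
    if start == -1 or end < start + len(flank):
        return None
    return seq[start:end + len(flank)]


def percent_identity(a, b):
    """Percent identity of two pre-aligned strings of equal length.

    Columns that are gaps in both sequences are skipped (an assumption;
    other tools normalize differently).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(x, y) for x, y in zip(a, b) if not (x == "-" and y == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


def pairs_at_or_above(aligned, cutoff):
    """All-against-all comparison; keep pairs meeting or exceeding `cutoff`.

    `aligned` maps a sequence name to its aligned sequence.
    """
    hits = []
    for (n1, s1), (n2, s2) in combinations(sorted(aligned.items()), 2):
        pid = percent_identity(s1, s2)
        if pid >= cutoff:
            hits.append((n1, n2, pid))
    return hits
```

Calling `pairs_at_or_above(aligned, 90)` and `pairs_at_or_above(aligned, 70)` would mirror the two cutoff values applied in the spreadsheet step.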
Nucleotide sequence accession numbers. The nucleotide sequences reported here have GenBank accession numbers DQ519449 through DQ519570. The msp2 (p44) variable region sequences present in the expression site (see Fig. S1 in the supplemental material) are DQ519449 through DQ519564, sequences of the msp2 (p44) expression site in the Norwegian sheep and Swedish dog strains are DQ519565 and DQ519566, and those of complete msp2 (p44) genes in the expression site (variants B1v1, D2v1, Sh1t1v4, and WR1v2) are DQ519567 through DQ519570.
Figure 2 compares the amino acid sequences of P44ESup1 and MSP2 (P44) from U.S. and European strains of A. phagocytophilum and from strains infecting different animal hosts. Overall, the structures of P44ESup1 and MSP2 (P44) were well conserved. The sequence of P44ESup1 derived from a strain infecting Norwegian sheep differed by 53 amino acid substitutions from a strain infecting humans in the United States. These changes were distributed throughout the molecule. The strain of A. phagocytophilum from infected dogs in Sweden differed by 16 amino acid substitutions from the U.S. sequence. Despite these amino acid sequence changes, the Ka:Ks ratios for the P44ESup1 nucleotide sequences were 0.179 to 0.272, which may suggest evolutionary selection pressure for conservation of this gene.
FIG. 2. Variation in the amino acid sequences of P44ESup1 (A) or MSP2 (P44) (B) between U.S. and European strains of A. phagocytophilum and strains infecting different mammalian host species (ushuman, U.S. human; uswrat, U.S. wood rat, Neotoma fuscipes; usbear, U.S. bear, Ursus americanus; swdog, Swedish dog, Canis familiaris; and norsheep, Norwegian domestic sheep, Ovis aries). The sequence of a strain infecting a human patient in the United States is from GenBank AY164493; all other sequences are from this study.
Diversity in the central hypervariable region of MSP2 (P44) among strains. The previous data demonstrated that, like U.S. strains of A. phagocytophilum derived from human infections, A. phagocytophilum strains from Europe and strains infecting diverse animal species differed primarily in a region of the msp2 (p44) expression locus encoding about 120 amino acids in the middle of the MSP2 (P44) protein (Fig. 2B). To examine the diversity of this region more closely, independent clones of the msp2 (p44) expression site were prepared and sequenced from all A. phagocytophilum strains. The central hypervariable regions encoding MSP2 (P44) and flanking conserved sequences from all clones were aligned (see Fig. S1 in the supplemental material). We also included hypervariable region sequences from our previous studies of this expression site, allowing the alignment of 116 total sequences derived from U.S. human infections and human strains grown in culture, as well as from the A. phagocytophilum strains infecting European dogs and sheep and U.S. bears and wood rats. The alignment showed the same flanking conserved regions to be present in most variants and that many different variants were present in each population analyzed. These variants were generally different between individual animals and even in the same animal when samples were available at multiple time points of a single infection. These data suggest (i) that similar antigenic variation mechanisms through expression site recombination are available to all strains of A. phagocytophilum and (ii) that there is great diversity in the potential repertoire of expressed sequences. The alignment also reveals some of the structural constraints on this variation, which include amino acids with conserved general structural properties, such as polarity and size, surrounding the highly conserved cysteine, phenylalanine, and tryptophan residues (see consensus sequence line in Fig. S1 in the supplemental material) recognized previously (29).
We employed MatGAT (10) to analyze the extent of expression site diversity. MatGAT generates pairwise identity matrices from an all-against-all comparison of the 116 aligned sequences. Excluding self-against-self comparisons, this MatGAT analysis included a comparison of 6,670 individual pairwise alignments having an average of 50.5% identity over the region analyzed (LAKT to LAKT). Of these 6,670 alignments, 68 (1%) had >90% identity and 106 (1.6%) had >70% identity. The small number of similar alignments suggests enormous potential global diversity in the hypervariable region sequences of expressed MSP2 (P44). Of the 68 alignments with >90% identity, 51 were among variants from different U.S. origin populations, 14 were among variants from strains infecting different Swedish dogs, and 3 were among strains infecting different Norwegian sheep. Therefore, some similar variants were observed in populations derived from humans, wood rats (a potential reservoir species; for details, see references 17 and 18), and a bear in the United States and also in organisms derived from humans and grown in culture. However, there was no sharing at ≥90% identity between the European and U.S. strains, or even between the strains from Norway and Sweden. MSP2 (P44) paralogs in the databases have not generally been identified as to their genomic origin (i.e., expression site versus pseudogene). However, we also compared the 116 expression site sequences from this study with all available paralogs in the UniProt/Swiss-Prot database (Batch BLAST; http://services.nbic.nl/bb/cgi-bin/bb_search.cgi). Zero of 19 Swedish dog variants and 5 of 23 Norwegian sheep variants matched a database paralog at ≥90% identity. Four of the five Norwegian sheep expression site variants matched paralogs previously identified in strains of A. phagocytophilum in sheep from the United Kingdom (11).
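The figure of 6,670 pairwise alignments is the number of unordered pairs among 116 sequences, 116 × 115 / 2, and the reported percentages follow from the counts of 68 and 106. A short check, with the helper names `pairwise_count` and `share` introduced here only for illustration:

```python
from math import comb


def pairwise_count(n):
    """Number of distinct pairs among n sequences, excluding self-comparisons."""
    return comb(n, 2)


def share(hits, total):
    """Percentage of `total` pairwise alignments represented by `hits`."""
    return 100.0 * hits / total


total = pairwise_count(116)          # all-against-all, 116 expression site variants
above_90 = round(share(68, total), 1)   # alignments reported with >90% identity
above_70 = round(share(106, total), 1)  # alignments reported with >70% identity
```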
The complete genome sequence of a U.S. (human-derived) strain, HZ, of A. phagocytophilum has recently been published (14). A total of 113 p44 loci were annotated, but some of these are truncated or are short 5' or 3' gene fragments. Within these 113 loci, we identified 94 that had at least one of the conserved regions flanking the central hypervariable region that previous data (28) suggest is required for gene conversion. The presence of at least one of these flanking conserved regions also enables a reasonable alignment to be made. A MatGAT comparison was then conducted that included these 94 genomic p44 loci identified in the complete A. phagocytophilum genome sequence as well as the 116 expression site variants from the United States and Europe (see Fig. S1 in the supplemental material). We identified those pairwise alignments having 100 and ≥90% identity, in order not to miss otherwise good matches that differed by only a few amino acids. With the exception of wood rat variants, we could match nearly all U.S.-origin A. phagocytophilum expression site variants with a locus in the sequenced HZ strain (Table 1). There were many 100% identities between the different genomic copies and U.S. human or human-derived culture variants (see Table S1 in the supplemental material). In addition, some A. phagocytophilum variants infecting wood rats or bears in the United States matched at ≥90% identity with known genomic copies of the HZ strain. However, this was not the case with the European variants. No alignments were found having ≥90% identity between the genomic loci and the variants identified in A. phagocytophilum infections of Norwegian sheep. There was one match at ≥90% identity between a Swedish dog variant (D1v1) and a (U.S.) genomic copy (aph_1128). An examination of this alignment (see Fig. S1 in the supplemental material) reveals a very similar hypervariable region that likely results from limited evolutionary divergence of the same donor pseudogene copy. Despite the general dissimilarity between genomic copies of the U.S. strain and European expression site variants, the overall structure of the central hypervariable region was maintained in the European strains, in terms of the known signature conserved residues (29). This suggests that while an analogous variation mechanism is used in the European strains, there is diversity in the genomic donor pseudogene repertoire used for gene conversion of the expression site.
TABLE 1. Number of shared expression site variants with HZ strain genomic loci
FIG. 3. Shared hypervariable region sequence between MSP2 (P44) expression sites in A. phagocytophilum infecting wood rats. Blood samples were obtained from different wood rats at a field site in Hoopa Valley, California. Shared sequences in the hypervariable region are colored similarly: red, 89% identity; yellow, 76, 73, and 72% identity; green, 63% identity; blue, 49% identity.
Out of 6,670 pairwise alignments of expression site variants from the United States and Europe, only about 1% had significant identity (defined here as ≥90% identity in the hypervariable region). None of the pairwise alignments with ≥90% identity were between A. phagocytophilum isolates of U.S. and European origin or between isolates of Swedish and Norwegian origin. This suggests that the repertoire of outer membrane protein variants that can potentially be expressed globally by A. phagocytophilum is very large. The great flexibility of this organism, with respect to its genome structure and its ability to infect diverse vertebrate and invertebrate hosts, is unusual for an organism of only 1.47 Mb total genome size.
In an analysis of 116 total expression site variants encoding MSP2 (P44), we identified matches with ≥90% identity to 36 genomic loci of the sequenced U.S. HZ strain of A. phagocytophilum and matches of 100% identity to 18 loci. These loci have been annotated as two different gene classes, either full-length or silent/reserve genes (14). The presence of sequence from both classes in the expression site supports the previous suggestion that either class can donate sequence to the expression site to generate outer membrane protein diversity (14). The expression site variants that could be matched to genomic loci were uniformly derived from A. phagocytophilum of U.S. origin, with the single exception of one variant of Swedish origin. This suggests marked diversity in the strains of A. phagocytophilum isolated in Europe compared to that in North American strains. It also supports previous microarray data showing that all but four msp2 (p44) genomic copies were present in three U.S. strains from different geographic locations (14). However, these analyses are still limited to relatively few populations of A. phagocytophilum and have not yet included, for example, populations circulating in white-tailed deer in the United States that are thought not to be associated with human infection (30).
Only 5/19 wood rat variants could be matched to known genome loci, in contrast to the matching of nearly all variants isolated directly from humans or bears or grown in culture. The human variants were derived from acute, clinically symptomatic cases. Although the times of infection of the wood rats are not known, these animals undergo persistent infections with A. phagocytophilum and have been implicated as a potential reservoir species (17). Therefore, it is likely that the wood rat samples contain organisms more representative of long-term infections. The shared blocks of sequence found throughout the hypervariable region in variants derived from different wood rats suggest that segmental gene conversion may introduce more-complex mosaics into the expression site, as has been shown in A. marginale (9). If it is similar to A. marginale, the basic pseudogene repertoire may be expressed earlier in infections without extensive recombination among hypervariable region sequences (21). Therefore, resolving this question definitively will require the analysis of long-term infections in reservoir hosts experimentally infected with a genome-sequenced strain of A. phagocytophilum.
Unlike wood rats, sheep, and dogs, where persistent infections have been clearly demonstrated (15, 18, 39), it is unclear if persistent infections are a feature of the disease in humans. However, this possibility should be considered in light of the extensive structural diversity demonstrated here, the long-term adverse health outcomes of A. phagocytophilum infection (36), and the high rate of serological positivity (15 to 36%) of people in areas of endemicity (13). Another important question is whether antibiotic treatment, as currently prescribed, completely sterilizes the infection.
Despite the expression site diversity seen in this study, it is clearly not limitless even within the central hypervariable region of MSP2 (P44). In addition to the conserved "signature" residues KVC, C, F, and NWPT, there are conservative replacements elsewhere with respect to charge or size that presumably form the necessary framework for correct folding of the protein on the surface of the organism. The different hypervariable regions could have additional functions beyond immune evasion, such as in binding to and invading different cell types. There is evidence that antibodies to MSP2 (P44) and recombinant MSP2 (P44) antagonize adhesion of A. phagocytophilum to granulocytes (13, 35). The binding of bacteria to particular cell types may depend on the expression of particular MSP2 (P44) variants. There are now many orthologs and paralogs of MSP2 (P44) known in different Anaplasma and Ehrlichia species that form Pfam01617. No three-dimensional structure is known for any of these; therefore, an important priority for future research should be to crystallize these outer membrane proteins and obtain complete structures. This would allow a better interpretation of the biological significance of the most variable and more conserved framework residues. The data presented here should allow the correlation of sequence changes with structure and with effects on immune evasion and host cell binding.
In summary, A. phagocytophilum organisms infecting diverse animal species and humans in the United States and Europe appear to possess and utilize similar genomic expression loci for generating outer membrane protein diversity. Although there has been some divergence in this locus between U.S. and European strains, fundamental structural features are conserved. There is significant global diversity in MSP2 (P44) generated through recombination into this locus. It may represent an ongoing adaptation to cell invasion and survival in different hosts as well as an obstacle to the control of this persistent infection.
We thank Robert Massung of the Centers for Disease Control, Atlanta, GA, for assistance with the provision of DNA samples from European sources of A. phagocytophilum and Ulrike G. Munderloh and Susan J. Wong for provision of samples from cultures of the HGE2 strain and from infected patients.
Published ahead of print on 11 September 2006.
Supplemental material for this article may be found at http://iai.asm.org/.