Infection and Immunity, November 2005, p. 7180-7189, Vol. 73, No. 11
0019-9567/05/$08.00+0 doi:10.1128/IAI.73.11.7180-7189.2005
Copyright © 2005, American Society for Microbiology. All Rights Reserved.
Program in Vector-Borne Diseases, Department of Veterinary Microbiology and Pathology, Washington State University, Pullman, Washington 99164-7040,1 Emerging Technologies/Delivery, Queensland Department of Primary Industries and Fisheries, c/o Locked Mail Bag No. 4, Moorooka 4105, Queensland, Australia,2 Tick Fever Centre, Biosecurity, Queensland Department of Primary Industries and Fisheries, 280 Grindle Road, Wacol, Queensland, Australia3
Received 1 July 2005/ Returned for modification 12 August 2005/ Accepted 20 August 2005
Although attenuated vaccines generally provide protection against bovine babesiosis, disease outbreaks involving heterologous isolates occur (4, 5). These heterologous field isolates, termed vaccine breakthrough isolates, are genetically distinct from the vaccine strain used to immunize animals (13). However, the composite genetic and antigenic changes that result in vaccine breakthrough are not defined. LeRoith et al. recently showed that significant MSA-1 variation was present in every vaccine breakthrough isolate examined (12). Based on studies using American strains (defined by geographic origins), MSA-2 has been postulated to be less divergent despite apparent intragenic sites of genetic exchange (10). However, because of their familial relationship, we hypothesized that MSA-2 would demonstrate similar sequence divergence when examined in isolates of B. bovis able to escape vaccine-induced immunity. Further, this divergence was postulated to arise in part from genetic exchange among VMSA family members. To test these hypotheses, we characterized the complete msa-2 locus from 12 Australian strains and isolates, including two vaccine strains and eight vaccine breakthrough isolates, and compared the msa-2 genes to the two previously characterized loci (10) as well as one new American isolate. The results identify a unique structure of the msa-2 locus in Australian strains and isolates and confirm the hypothesis that significant sequence variation occurs in breakthrough isolates. While genetic exchange does appear to contribute to sequence diversity, unique degenerate nucleotide repeats encode much of the variation in a proline-rich hypervariable region (HVR) occupying the carboxy third of MSA-2a/MSA-2b proteins.
TABLE 1. Strains and isolates used in msa-2 locus and gene analyses
TABLE 2. Oligonucleotides used for the amplification of msa-2 loci and msa-2 genes and Southern blot experiments
The cloned loci, except for T2Bo, were sequenced by subcloning four overlapping PCR amplicons that spanned the original insert. The msa-2c and -2a/b genes were amplified from plasmids containing complete loci using the primer sets described below and cloned into pCR-4-TOPO. The intergenic region between msa-2c and -2a/b and the intergenic region between msa-2a/b and orfB were amplified from plasmid DNA using primer sets B42/44-R-f with msa-2-F1-r and B42/44-R-f with orf-b-F3, respectively. Miniprep DNA was sequenced using ABI chemistries and analyzed on an ABI3100 genetic analyzer (Applied Biosystems, Foster City, CA). The four subclone sequences were compiled into a single contiguous sequence using ContigExpress (Invitrogen). The T2Bo msa-2 locus sequence was derived from shotgun sequencing of a BAC clone (1K11) generated from a genomic DNA library of the T2Bo strain (www.vetmed.wsu.edu/research_vmp/babesia-bovis), followed by verification via subcloning and sequencing.
Amplification and sequencing of msa-2 genes from B. bovis isolates and strains. msa-2a and -2b of the T2Bo strain and msa-2a/b of strains or isolates L, S, F28, F35, T (T-1 msa-2a/b), and G06 were amplified from genomic DNA using primer set msa-2-F1 and B42/44-R with Taq DNA polymerase (Invitrogen). msa-2a/b of the K strain was amplified from genomic DNA using primer set msa-2-F1-k and B42/44-R. msa-2c of the K, S, L, T, and T2Bo strains and the F and G isolates was amplified with primer set msa-2c-F and B42/44-R. Additional msa-2a/b genes were amplified from cloned msa-2 loci (see above). msa-2a/b of isolates G36 and F35 was amplified from locus clones with primer set msa-2-F1 and B42/44-R. An msa-2a/b gene from the T strain (sequence T-2 msa-2a/b) and msa-2a/b from isolates G06, G51, G52, F3, and F40 were amplified from locus clones with primer set aus-IR-F and B42/44-R (primer sequences shown in Table 2). Amplicons were cloned into pCR-4-TOPO (Invitrogen). Screening for positive clones, end sequencing, and analyses were done as described above. Nine additional T msa-2a/b clones (amplified with forward primer msa-1-F1, aus-IR-F, or msa-1-F1-k and reverse primer B42/44) were sequenced and were nearly identical to either T-1 or T-2 msa-2a/b. Similarly, nine additional K msa-2a/b clones (amplified with forward primer msa-1-F1-k or aus-IR-F and reverse primer B42/44) were sequenced and were nearly identical to K msa-2a/b.
The protein secondary structure of MSA-2a/b was analyzed by SSpro (www.igb.uci.edu/tools/scratch/) and nnPredict (www.cmpharm.ucsf.edu/~nomi/nnpredict.html).
GenBank sequence accession numbers. GenBank accession numbers of the previously published genomic DNA sequences from which protein sequences were derived are as follows: Mo7 msa-2 locus, AY052538; R1A msa-2a1, AY052539; R1A msa-2a2, AY052540; R1A msa-2b, AY052541; and R1A msa-2c, AY052542.
Nucleotide sequence accession numbers. GenBank accession numbers of the new genomic DNA sequences from which protein sequences were derived are as follows: F28 msa-2a/b, DQ173946; F3 msa-2a/b, DQ173947; F35 msa-2a/b, DQ173948; F40 msa-2a/b, DQ173949; G06 msa-2a/b, DQ173950; G36 msa-2a/b, DQ173951; G51 msa-2a/b, DQ173952; G52 msa-2a/b, DQ173953; K vaccine strain msa-2a/b, DQ173954; L strain msa-2a/b, DQ173955; S strain msa-2a/b, DQ173956; T vaccine strain msa-2a/b T-1, DQ173957; T vaccine strain msa-2a/b T-2, DQ173958; T2Bo strain msa-2a1, DQ173959; T2Bo strain msa-2a2, DQ173960; T2Bo strain msa-2b, DQ173961; F28 msa-2c, DQ173962; F3 msa-2c, DQ173963; F35 msa-2c, DQ173964; F40 msa-2c, DQ173965; G06 msa-2c, DQ173966; G36 msa-2c, DQ173967; G51 msa-2c, DQ173968; G52 msa-2c, DQ173969; K vaccine strain msa-2c, DQ173970; L strain msa-2c, DQ173971; S strain msa-2c, DQ173972; T vaccine strain msa-2c, DQ173973; T2Bo strain msa-2c, DQ173974; F3 msa-2 locus, DQ173975; F35 msa-2 locus, DQ173976; F40 msa-2 locus, DQ173977; G06 msa-2 locus, DQ173978; G36 msa-2 locus, DQ173979; G51 msa-2 locus, DQ173980; G52 msa-2 locus, DQ173981; L strain msa-2 locus, DQ173982; S strain msa-2 locus, DQ173983; T vaccine strain msa-2 locus, DQ173984; T2Bo strain msa-2 locus, DQ173985.
FIG. 1. The Australian virulent B. bovis L strain contains a single msa-2a/b gene. (A) Ethidium bromide-stained agarose gel of the msa-2 locus amplified from biological clone Mo7 (lane 2), Texas T2Bo strain (lane 3), and Australian virulent L strain (lane 4) with a 1.0-kb marker in lane 1. (B) Schematic of the msa-2 locus from T2Bo, based on the sequence of BAC clone 1K11 (see Materials and Methods), and L, based on the nucleotide sequence of the amplified msa-2 locus. (C) Southern blot of T2Bo genomic DNA and a T2Bo msa-2 locus clone digested with PstI and probed with digoxigenin-labeled T2Bo msa-2a/b-specific oligonucleotide (left panel) and L strain genomic DNA and L msa-2 locus clone digested with EcoRI/HindIII and probed with digoxigenin-labeled L msa-2a/b-specific oligonucleotide (right panel). Probe detection was by chemiluminescence. Size markers are in the far left and far right lanes, and the relevant markers are designated in base pairs to the left and right of the panels.
The msa-2 loci from the Australian T vaccine strain, vaccine breakthrough isolates of the F and G series (F3, F35, F40, G06, G36, G51, and G52), and the virulent S strain were amplified from genomic DNA and produced, similarly to the L strain, a 3.2-kb amplicon. All loci have an organization identical to that of the L strain, with one msa-2c gene at the 5' end of the locus and one additional msa-2a/b gene located 3' of msa-2c.
Analysis of MSA-2a/b protein sequences. msa-2a/b genes were sequenced from the virulent Australian L and S strains, the Australian K and T vaccine strains, several Australian vaccine breakthrough isolates (F and G), and the T2Bo strain (Table 1). These sequences along with the msa-2a/b genes from the South American R1A vaccine strain and the Mo7 lab strain were aligned and analyzed. The MSA-2a/b molecule can be split into four different regions: the leader sequence (M1 to S20), the body (A21 to X200), the HVR (P201 to X300), and the GPI anchor signal sequence (X301 to F317) (Fig. 2). The leader and GPI anchor signal sequences, both of which are predicted to be absent from the surface-expressed protein, are the only highly conserved regions of the protein, with 90% and 88% overall identity among all sequences, respectively. The body of the molecule is less conserved but has several small islands of complete identity. Notable is a 34-residue block (Y151 to M184, 62% identity) that contains one of the most hydrophilic regions of the molecule (10). The block has three stretches of absolute conservation: YYK, VKFCND, and SPFM, as identified in previously examined American strains (10). Additional short regions of relative conservation are present in the amino half of the body and include a 16-residue stretch (T32 to T48) with 59% identity. However, the conservation of 129FNAFLNDNP137, found among all R1A and Mo7 MSA-2a/b sequences (10), is not maintained in the Australian strains and isolates. In contrast to the body of the molecule, the HVR is a highly divergent region with few conserved residues and is analyzed in depth below.
FIG. 2. The MSA-2a/b proteins are diverse and have a hypervariable region in the carboxy half of the molecule. An alignment similarity plot of all examined MSA-2a/b proteins is shown. Regions of relative high identity are indicated by an overlying bar with sequence information and composite percent identity provided. Dashes in sequences indicate nonconserved residues. A schematic of MSA-2a/b molecular organization based on sequence variation and known function is shown below. Protein residues noted at the junction of sequence segments are initiating residues.
FIG. 3. Vaccine breakthrough isolate MSA-2a/b protein sequences differ from their associated vaccine strains. (A) MSA-2a/b sequence alignment of the Australian T vaccine strain compared to the vaccine breakthrough G isolates (G52, G51, G36, and G06). (B) MSA-2a/b sequence alignment of the Australian K vaccine strain compared to the vaccine breakthrough F isolates (F3, F28, F35, and F40). The underlined sequences indicate stretches of complete conservation. (C) MSA-2a/b amino-terminal region alignment of T vaccine strain T-1, vaccine breakthrough isolate G06, and Mo7 2a1. Boxes highlight regions of high identity between G06 and Mo7 2a1. Spaces denote identity, and dashes indicate gaps. Alignments were constructed by CLUSTALW.
Comparisons between vaccine strain and respective vaccine breakthrough isolate MSA-2a/b molecules also demonstrated defined stretches of polymorphism. Similar stretches of sequence can be found in other Australian and American strains. This is exemplified in Fig. 3C, which shows a multiple sequence alignment for T-1, G06, and Mo7. While G06 MSA-2a/b has high identity to T-1 MSA-2a/b (84.9%), two segments of G06 (denoted in boxes in Fig. 3C) have nearly complete identity to Mo7 MSA-2a1. Similar examples have been observed among the Mo7 and R1A MSA-2a/b molecules. As has been previously proposed, these regions appear to be sites of genetic exchange (10).
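One way to look for segments like those boxed in Fig. 3C is to compare a query sequence against two references in sliding windows and flag the windows in which the query matches the otherwise more distant reference better. The following Python sketch is illustrative only and is not the analysis used to produce Fig. 3C; the window size, the function names, and the toy aligned strings are assumptions made for the example.

```python
def window_identity(a: str, b: str, start: int, size: int) -> float:
    """Fraction of identical positions between a and b in one window.

    Note: a gap aligned to a gap counts as identical in this simple sketch.
    """
    seg_a, seg_b = a[start:start + size], b[start:start + size]
    return sum(x == y for x, y in zip(seg_a, seg_b)) / len(seg_a)


def closer_to_second(query: str, ref1: str, ref2: str, size: int = 5) -> list:
    """Start positions of windows where query matches ref2 better than ref1.

    All three sequences must already be aligned to the same length.
    """
    if not (len(query) == len(ref1) == len(ref2)):
        raise ValueError("sequences must be aligned to the same length")
    flagged = []
    for start in range(len(query) - size + 1):
        id1 = window_identity(query, ref1, start, size)
        id2 = window_identity(query, ref2, start, size)
        if id2 > id1:
            flagged.append(start)
    return flagged


# Hypothetical toy sequences (not real MSA-2a/b data): the query resembles
# ref1 at both ends and ref2 in the middle.
query = "AAAAACCCCCAAAAA"
ref1 = "AAAAAGGGGGAAAAA"
ref2 = "TTTTTCCCCCTTTTT"
print(closer_to_second(query, ref1, ref2))
```

A run of consecutive flagged windows marks a candidate segment. Whether such a segment reflects genetic exchange rather than chance similarity would still need to be judged from the full alignment.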
Analysis of the hypervariable region. Comparison between vaccine strains and their respective breakthrough isolates indicates that the HVR contains the greatest amount of diversity in the molecule and that it varies in length. To determine the basis for this, we examined the amino acid and nucleotide sequences of the HVR from 22 MSA-2a/b proteins. The HVRs of MSA-2a/b proteins are extremely hydrophilic (data not shown) and are defined at the amino end by a completely conserved P201 and at the carboxy end by the predicted GPI anchor signal sequence cleavage site of SFT (Fig. 2). The length of this region varies widely from 34 (F28 MSA-2a/b) to 84 (Mo7 MSA-2a1) amino acids.
The HVR is rich in proline residues, especially in comparison to the remainder of the molecule (Fig. 4A). Sixteen percent of residues in the combined pool of HVR codons (443 codons total) are proline. They are irregularly spaced and have an intervening sequence length ranging from 0 to 13 residues. To understand why this part of the molecule is proline rich, manual alignment of representative HVRs, including the carboxy region of two msa-1- and two msa-2c-encoded proteins, was performed (Fig. 4B). These manual alignments identify three major, semiconserved, proline-containing motifs. The first motif of 11 residues (red sequence in Fig. 4B) has a consensus sequence of QGTTGTQ-[PQ]-SQD and is present at least once, in whole or in part, in all HVRs at a well-conserved position near the amino end of the HVR. The second motif of four residues (purple sequence in Fig. 4B) has a consensus sequence of PAAP and follows the first motif with a variable number of intervening amino acids. The third motif of nine residues (green sequence in Fig. 4B) has a consensus sequence of QPTKPAETP and is limited to the carboxy half of the HVR. It is present in whole or truncated form in all but one of the sequences and in all but one MSA-2a/b sequence is associated with a semiconserved nonrepeated segment with a consensus sequence of GNLNG (blue sequence in Fig. 4B).
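The consensus motifs described above can also be searched for programmatically. The following Python sketch is not the authors' method (the alignments in Fig. 4B were made manually); it converts the consensus strings given in the text into regular expressions and reports exact matches together with the proline fraction of a sequence. The example sequence is a made-up toy string, not a real MSA-2a/b HVR, and the function names are hypothetical.

```python
import re

# Consensus strings as written in the text (PROSITE-like notation).
MOTIFS = {
    "motif1": "QGTTGTQ-[PQ]-SQD",
    "motif2": "PAAP",
    "motif3": "QPTKPAETP",
    "GNLNG": "GNLNG",
}


def consensus_to_regex(consensus: str) -> re.Pattern:
    """Turn 'QGTTGTQ-[PQ]-SQD' into the compiled regex 'QGTTGTQ[PQ]SQD'."""
    return re.compile(consensus.replace("-", ""))


def find_motifs(seq: str) -> dict:
    """Return {motif name: [0-based start positions]} for exact matches."""
    hits = {}
    for name, consensus in MOTIFS.items():
        pattern = consensus_to_regex(consensus)
        hits[name] = [m.start() for m in pattern.finditer(seq)]
    return hits


def proline_fraction(seq: str) -> float:
    """Fraction of residues that are proline (P)."""
    return seq.count("P") / len(seq) if seq else 0.0


# Hypothetical toy sequence for illustration only.
toy_hvr = "PEQGTTGTQPSQDAAPAAPTTQPTKPAETPGNLNGSFT"
print(find_motifs(toy_hvr))
print(round(proline_fraction(toy_hvr), 2))
```

Exact matching of this kind will miss the degenerate and truncated copies that the manual alignment captures, so it is best read as a first-pass screen.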
FIG. 4. The HVR of MSA-2a/b contains a series of three, proline-rich, repeating motifs. (A) Amino acid alignment of all 22 MSA-2a/b proteins examined. A period replaces all nonproline residues, a red "X" replaces all proline residues, and dashes indicate gaps in sequences. (B) Manual amino acid alignment of 10 representative MSA-2a/b HVRs along with the 3' region upstream of the GPI anchor signal sequence of F35, F40, and R1A MSA-1 and F40 and F28 MSA-2c. Alignments are based on both amino acid and nucleotide sequences. Motifs are highlighted by color. Sequences of the first motif are red, sequences of the second motif are purple, sequences of the third motif are green, and the common GNLNG sequence is blue. Residues that are not assigned to a motif are in black letters. Blank spaces indicate gaps. A black bisected underline splits the HVR into amino and carboxy regions as labeled.
FIG. 5. The amino-terminal region of the HVR is encoded by a series of degenerate nucleotide repeats. (A) Nucleotide sequence alignment of representative segments from the amino-terminal region (Fig. 4B) of HVRs. Base pair substitutions in reference to the consensus sequence are italicized; base pairs identical with the consensus sequence are red. The encoded amino acid sequences of these segments, with their relative positions in the 5' region of the HVR, are aligned in the left half of the figure. The amino acid alignment is partitioned into three segments and numbered as indicated at the top of the alignment. "Md" indicates sequences derived from the middle portion of the amino half of the HVR. Spaces denote gaps. (B) Sequences of representative HVRs in which the amino acid sequence fragments of the 5' region are replaced by the segment number to which their nucleotide sequences mapped.
Comparison of the MSA-2a/b HVR to other VMSA family members. All VMSA family members share a well-conserved GPI anchor signal sequence. In contrast, this carboxy-terminal signal sequence is preceded by an HVR in MSA-2a/b that has very few overall similarities to other family members. There are several examples, however, where segments of the HVR have high identity to a corresponding sequence in other VMSA family members (Fig. 6).
FIG. 6. The HVR has sequence similarities to other VMSA family members. Three separate amino acid alignments are shown. Sequences shared among VMSA family members are boldfaced. The bracketed segments highlight motifs 1 and 3 as specified in the text (Fig. 4). Dashes indicate sequence gaps. The fractions listed at the right of the sequences are the frequency with which the representative sequence segments are present in the indicated VMSA family members that were examined.
A second example is the sequence similarity and motif associations of R1A MSA-2a2 to MSA-2c. The third motif of R1A MSA-2a2, unlike all other MSA-2a/b proteins, is not associated with the GNLNG sequence (Fig. 4B). However, this arrangement of the third motif without the adjoining sequence is representative of nearly all MSA-2c proteins examined (9/10). Moreover, the sequence of R1A MSA-2a/b is identical over this stretch to F40 MSA-2c, including the eight remaining downstream residues of the HVR.
Lastly, the region just upstream of the GPI anchor cleavage site of F28 MSA-2c, unlike all other MSA-2c proteins, contains the GNLNG sequence present in the HVR of most MSA-2a/b proteins. This region, along with seven residues downstream, has 77% identity to Mo7 MSA-2a2 and shares four residues upstream with the truncated third motif in T2Bo MSA-2b (PPQT) (Fig. 4B). Collectively, these data suggest that genetic exchange may have occurred among all members of the VMSA family, including msa-2a/b, msa-2c, and msa-1 genes.
Analysis of MSA-2c. Analysis of the deduced amino acid sequences of msa-2c from all Australian and American strains is consistent with previous findings that these proteins are more highly conserved than MSA-2a/b across the entire molecule (10, 28). Nearly all proteins are 265 residues, and they collectively have 75% identity, with pairwise comparisons ranging from 100 to 85% identity. In contrast to the highly polymorphic MSA-2a/b proteins, the vaccine breakthrough MSA-2c proteins have 92% to 100% identity to the MSA-2c of their respective vaccine strain. F28 MSA-2c, however, is an exception to this group. The protein is 254 residues long as a result of three short deleted segments, with pairwise comparisons ranging from 53 to 55% identity to MSA-2c of all other strains and isolates. Interestingly, however, F28 MSA-2c also is closely related to BabR 0.8 (50% overall identity), another VMSA family member not present in the msa-2 locus (10), with 73% identity to the first 180 amino acid residues. All other MSA-2c proteins are more distantly related to BabR, with overall identities that range between 39% and 40%.
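The identity figures quoted above depend on how identity is counted. The Python sketch below shows one common convention, assumed here for illustration and not necessarily the one used in this study: pairwise identity is the number of matching residues divided by the number of alignment columns in which neither sequence has a gap, and collective identity is the fraction of columns in which every sequence carries the same residue. The aligned strings and function names are hypothetical.

```python
from itertools import combinations


def pairwise_identity(a: str, b: str, gap: str = "-") -> float:
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not compared:
        return 0.0
    matches = sum(x == y for x, y in compared)
    return 100.0 * matches / len(compared)


def collective_identity(aligned: list, gap: str = "-") -> float:
    """Percent of alignment columns in which all sequences share one residue."""
    columns = list(zip(*aligned))
    if not columns:
        return 0.0
    conserved = sum(len(set(col)) == 1 and col[0] != gap for col in columns)
    return 100.0 * conserved / len(columns)


# Hypothetical aligned fragments, for illustration only.
alignment = {
    "seqA": "MKLV-AYYKN",
    "seqB": "MKLVSAYYKE",
    "seqC": "MRLVSAYYKN",
}

for (n1, s1), (n2, s2) in combinations(alignment.items(), 2):
    print(n1, n2, round(pairwise_identity(s1, s2), 1))
print("collective", round(collective_identity(list(alignment.values())), 1))
```

Collective identity can never exceed the lowest pairwise value, which is why a set of proteins can share 75% identity overall while every pair is at least 85% identical.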
The only region of the predicted mature molecule that is semiconserved among vaccine strains and their breakthrough isolates, as well as in MSA-2a/b proteins of all other strains, is in the central region of the molecule containing the YYK motif. A similar motif (YFK), also in the central region of the molecule, is completely conserved in the MSA-1 proteins examined from Australia and America (12), suggesting that this part of the molecule in both MSA-1 and MSA-2a/b may have the same function. The tripeptide along with two semiconserved downstream residues, YYK-[KN]-[EH], is reminiscent of the binding motif (YY-[LIV]-[DN]-H) contained within WW modules, well-described motifs that bind proline-rich and phosphorylated targets (14). The regions containing these two motifs, intriguingly, are predicted to have similar secondary structures (data not shown).
The HVR is one of the most hydrophilic regions of the molecule and, interestingly, is proline rich as a result of three, frequently repeated, proline-containing, semiconserved motifs. HVR-specific monoclonal antibodies (11) bind to the surface of live merozoites (20), and monospecific antiserum directed against MSA-2a/b proteins blocks erythrocyte attachment and invasion (16). Taken together, these findings imply that the HVR may be involved with the initial interactions between merozoites and the erythrocyte surface during invasion. Several well-described Plasmodium surface proteins involved in cell invasion and the procyclins of Trypanosoma brucei also contain proline-rich regions (PRRs). Plasmodium vivax and Plasmodium knowlesi Duffy receptor proteins have central PRRs (9) that have recently been shown to contain the elements responsible for binding Duffy antigens (25), allowing invasion only in Duffy antigen-positive individuals. Duffy blood group antigens have also been described for cattle, and the ability of B. bovis to infect erythrocytes from different species of cattle positively correlates with the prevalence of Duffy antigen expression in different species (18). The central repeat region II of the circumsporozoite protein (CSP) of several Plasmodium spp. also constitutes a PRR (7, 22) that appears to function as a linker between two binding domains implicated in hepatocellular adhesion. The procyclins of T. brucei are GPI-anchored surface proteins expressed during the life cycle in the tsetse fly that, similar to the MSA-2a/b proteins, have PRRs abutting the GPI anchor signal sequence (29). Acosta-Serrano et al. postulate that the function of this region could be to protect the parasite during its passage through the midgut (1).
How and when MSA-2a/b proteins undergo variation are not clear. The amino acid changes within the body of the molecule range from numerous scattered substitutions and rare short deletions to large segments of sequence diversity that are shared in other MSA-2a/b proteins suggestive of genetic exchange. The most variable part of the molecule, though, is the HVR; it has no regions of complete conservation and contains large indels. Much of this length variation can be attributed to the presence of a series of variable numbers of degenerate repeats in the 5' end of the HVR. This is reminiscent of the central region II of the Plasmodium CSP, which is composed of 25 to 40 short proline-containing, tandemly arranged, repeating motifs (3). Slipped-strand mispairing, occurring during clonal expansion, has been implicated in generating the differences in the number of repeats between different clones of Plasmodium falciparum and Plasmodium reichenowi (21). The types of changes generated by this method have yet to be observed in the B. bovis VMSAs. As these HVR polymorphisms likely lead to changes in surface epitopes, a clearly useful application of this method of diversification, as well as gene conversion, would be during clonal expansion within the mammalian host. Finely targeted unequal crossover events within the HVR can also lead to a series of repeats in this region. As the sexual stages of B. bovis occur in Boophilus microplus, variation in the HVR produced by this method would occur during passage of the parasite through the tick vector. Studies designed to determine in what part of the life cycle and how these changes arise, as well as how these changes may affect escape from immune recognition, are currently under way.
Initial characterization of the msa-2 locus in the biological clone Mo7 demonstrated that the locus contained three msa-2a/b genes (-2a1, -2a2, and -2b) and one msa-2c gene (10). While another American strain, T2Bo, similarly has four msa-2 genes in the locus, the msa-2 locus from 12 Australian strains and isolates has only one msa-2a/b and one msa-2c gene. How the presence of only a solitary msa-2a/b gene in the Australian loci affects the generation of sequence diversity, compared to multigene loci such as Mo7 and T2Bo, is unknown. Several mechanisms of gene diversification, such as gene conversion and homologous recombination, depend a priori on pools of similar genetic material. Because the Australian msa-2 loci have only one msa-2a/b gene, the pool of genetic material available for these mechanisms is limited in comparison to multigene loci. Nevertheless, the ability of these isolates to persist in the mammalian host and in the population, coupled with their postulated role in host immune system evasion, indicates that mechanisms of generating diversity in these loci provide sufficient variability. While this could be accomplished by homologous recombination or gene conversion among msa-2a/b genes from different isolates, occurring in the tick vector and/or mammalian host, sequence similarities among MSA-2a/b, MSA-2c, and MSA-1 suggest that additional sources of genetic material available for genetic exchange may include msa-1 and msa-2c.
In summary, the msa-2 locus of all Australian B. bovis strains and isolates examined, unlike the American strains, contains only two msa-2 genes: one msa-2c and one msa-2a/b gene. Analysis of 22 msa-2a/b genes from 16 different American and Australian B. bovis strains and isolates indicates that these genes encode a diverse set of proteins. As hypothesized, the encoded MSA-2 proteins of Australian vaccine breakthrough strains are different from the MSA-2 proteins of their vaccine counterpart, suggesting that these proteins may be involved in strain-specific protective immunity. The greatest diversity among MSA-2a/b proteins is in a hypervariable region, upstream of the GPI anchor signal sequence, which arises from variable numbers of proline-rich, degenerate repeats, similar to the repeat regions of Plasmodium CSP and T. brucei procyclins. Whether changes in the HVR arise only during the sexual stages, occurring in the tick vector, or also during passage through the mammalian host is under investigation.
This work was supported by NIH K08 Award 1 K08 AI060630-01 and USDA SCA58-5348-2-683.