ABSTRACT
Neisseria meningitidis causes meningococcal disease, often resulting in fulminant meningitis, sepsis, and death. Vaccination programs have been developed to prevent infection by this pathogen, but serogroup replacement is a problem. Capsular switching has been an important survival mechanism for N. meningitidis, allowing the organism to evolve in the present vaccine era. However, the underlying mechanisms have not been completely elucidated. Genetic analysis of capsular switching between diverse serogroups would help further our understanding of this pathogen. In this study, we analyzed, at the genomic level, the genetic characteristics of a sequence type 7 (ST-7) serogroup X strain that was predicted to have arisen from ST-7 serogroup A. Comparison of genomic structures and sequences showed that ST-7 serogroup X was closest to ST-7 serogroup A, and eight probable recombination regions, including the capsular gene locus, were identified. This indicated that serogroup X originated from serogroup A by recombination leading to capsular switching. The recombination involved approximately 8,540 bp from the end of the ctrC gene to the middle of the galE gene. There were more recombination regions and strain-specific single-nucleotide polymorphisms in the serogroup X genome than in the serogroup A genomes. However, no serogroup-specific genes were found except those in the capsule gene locus.
INTRODUCTION
Neisseria meningitidis is commonly regarded as a life-threatening pathogen of public health importance. The capsule is an important marker of virulent N. meningitidis, although many capsulated isolates rarely or never cause invasive meningococcal disease (IMD) (1). The main constituent of the capsule, polysaccharide, is also the effective antigen of most licensed N. meningitidis vaccines (2).
Based on the structure of its capsule polysaccharide, N. meningitidis has been classified into 12 serogroups. The genes responsible for expression of capsule polysaccharide are located in a single locus. Specific genes or sequences in the locus determine the oligosaccharide components and their linkages that correspond to certain serogroups (3). Horizontal gene exchange involving the capsule gene locus can lead to capsular switching, which is an important mechanism for the organism to escape host immunity and drives the species to change in response to vaccine programs (4–7).
Although great importance is attached to this organism's ability to exchange genes horizontally, resulting in capsule switching, the process is incompletely understood. The probability of capsular switching varies significantly between serogroups, which impedes extensive analysis of the recombination process through comparison of the genetic characteristics of events involving different serogroups. Most capsular switching events occurred between serogroups B, C, W, and Y, since their capsule polysaccharides are composed of sialic acid (5, 8–10). Capsular switching between serogroup A and other serogroups has rarely been reported, although serogroup A has historically been epidemic for many years (11, 12) and vaccines incorporating its antigens have been used frequently (13–15). The only study to assess capsular switching from serogroup A to another serogroup (serogroup C) confirmed that such switching can occur under natural conditions (16). As expected, capsular switching between serogroups A and C involved a long DNA fragment including multiple capsular genes. However, no further information was revealed because the analysis was limited to the capsule gene locus and not the larger region. Therefore, it was not possible to compare it to similar events. Recently, Pan and colleagues reported one clinical case where capsular switching from sequence type 7 (ST-7) serogroup A to X occurred (17). This ST-7 serogroup X strain represents another model in which the genetic basis of capsular switching in N. meningitidis serogroup A strains can be studied. In this study, we analyzed the genetic characteristics of the ST-7 serogroup X strain at the genome level in order to clarify the relationship between N. meningitidis ST-7 serogroup X and A strains. Furthermore, we investigated the differences between ST-7 serogroup X and A, especially at the capsule gene locus. Together with previous studies, our analysis will help to clarify the recombination process that underpins capsule switching between serogroup A and other serogroups.
RESULTS
Genome structure of strain 331401. The complete genome sequence of strain 331401 consisted of one circular chromosome of 2,191,116 bp, which was 3,096 bp larger than that of ST-7 serogroup A strain 510612. The average G+C content was 51.01% for the whole genome and 53.29% for gene regions. The chromosome was predicted to possess 2,452 genes, four rRNA operons, and 59 tRNAs, values identical or close to those for strain 510612.
Phylogenetic tree of genome sequences. Compared to the genome sequence of strain 8013 (serogroup C, ST-177), a total of 110,392 single-nucleotide polymorphisms (SNPs) were detected in 24 genomes that belonged to different clonal complexes (CCs). In the phylogenetic tree, serogroup A-related CCs, such as CC1, CC4, and CC5, formed a distinct cluster. Within this cluster, ST-7 strains, regardless of serogroup (A or X), were more similar to each other than to strains of other STs. Furthermore, four ST-7 strains from China (331401, 510612, 440529, and 310501) shared a set of specific SNPs not found in other ST-7 strains. Strain 331401 had the highest similarity to strain 440529, which is an ST-7 serogroup A strain isolated in China (Fig. 1).
FIG 1 Phylogenetic tree of 25 N. meningitidis strains based on genomic sequences. The tree was constructed using the neighbor-joining method in MEGA 6.0 with 1,000 bootstrap iterations. The strain identification number, serogroup, sequence type, clonal complex (CC), isolation year, and country are shown for each strain. Genomic sequences of strains with darkened identification numbers were obtained in this study. ST-7 and ST-7 serogroup X strains are boxed and shaded, respectively.
Synteny and insertion/deletion mutations. Synteny analysis showed that the genome of strain 331401 covered 99.63% of that of strain 510612, although many transposition events were observed along the genome (Fig. 2A). The sequences at transposition sites were analyzed and found to be repeat sequences, rRNAs, and insertion sequences. Compared with reference strain 510612, strain 331401 contained 24 insertion mutations and 27 deletion mutations. Eleven insertion and six deletion mutations were located in coding DNA sequences. Among these coding DNA sequences, three belonged to the functional category “replication, recombination and repair” and five were related to the biogenesis of the cell wall/membrane. The others had no functional match in the nt/nr and Swiss-Prot databases.
FIG 2 Comparison of genome and capsular locus sequences between the N. meningitidis ST-7 serogroup X strain and other strains. (A) Synteny analysis between ST-7 serogroup X (331401) and A (510612) strains. (B) Capsular locus genes of ST-7 serogroup X/A and three other serogroup X strains. The yellow rectangle in the internal sequence of strain 420815 represents an 804-bp insertion; the second gap in the internal sequence of strain 331401 represents a 13-bp deletion. (C) Single-nucleotide polymorphisms around the breakpoints. The deduced breakpoints are boxed in red.
Recombination in ST-7 strains. The number of strain-specific SNPs ranged from 6 to 823 for each serogroup A strain and was 2,435 for strain 331401 (Table 1). Most of the SNPs were scattered throughout the genomes, and the fragments with a high frequency of SNPs (≥5 SNPs/100 bp) were usually concentrated in several regions (Fig. 3). When we used 5 SNPs/100 bp as the cutoff, zero to six probable recombination regions (PRRs) were identified in each genome among the 12 ST-7 serogroup A strains; eight PRRs involving 1,422 SNPs were found in the genome of strain 331401. Consequently, the SNPs in PRRs accounted for 42.0% to 65.9% of all strain-specific SNPs in each genome. Previous studies determined that the per-site recombination/mutation estimate was 100:1 (18). Therefore, 5 SNPs/100 bp was a conservative cutoff for deciding whether a sequence was a PRR. One PRR of 331401, which was estimated to be 8,400 bp, contained the capsular locus sequence. The other seven PRRs, scattered throughout the genome, involved genes from a range of functional categories. To identify probable donors, the PRR sequences were searched against the NCBI nt/nr database. The highest identities were to non-serogroup A strains, although most sequences had no exact matches. For most PRRs, either the sequence matched a single strain with markedly different identity in different regions or different regions matched different strains, indicating that a single PRR was generated by multiple recombination events.
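As a rough check on the counts reported above, the share of strain-specific SNPs that lie inside PRRs can be computed directly. The sketch below is illustrative only: the function name, the 0-based half-open region convention, and the toy coordinates are assumptions, and only the two counts for strain 331401 (1,422 and 2,435) come from the text. With those counts the share is about 58.4%.

```python
def fraction_in_regions(snp_positions, regions):
    """Share of SNP positions inside any half-open (start, end) region.

    Hypothetical helper for illustration; coordinates are assumed to be
    0-based and regions half-open.
    """
    if not snp_positions:
        return 0.0
    inside = sum(
        1 for pos in snp_positions
        if any(start <= pos < end for start, end in regions)
    )
    return inside / len(snp_positions)


# Counts reported in the text for strain 331401.
snps_in_prrs = 1422
strain_specific_snps = 2435
percent_in_prrs = round(100 * snps_in_prrs / strain_specific_snps, 1)
print(percent_in_prrs)
```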
TABLE 1 Number of strain-specific SNPs in each N. meningitidis ST-7 genome
FIG 3 Frequency of strain-specific SNPs in N. meningitidis ST-7 strains along the genome sequence. Strain 331401 belongs to serogroup X; the other 12 strains belong to serogroup A. One serogroup X (331401) and five serogroup A (130508, 440529, 2002-2, 310501, and 510612) strains were isolated in China; the others were isolated in Niger, the United States, Burkina Faso, and Algeria. The vertical axis shows the number of strain-specific SNPs per 100 bp; the horizontal axis shows the genome position in strain 331401. The arrow indicates the capsule locus of strain 331401.
Capsular locus sequence and recombination breakpoints. Strain 331401 had an intact capsular locus (region D to D′) (3) that corresponded to serogroup X. The sequence from ctrC through galE differed markedly from that of strain 510612 (similarity ranging from <30% to 94.3%), whereas the sequences outside this region were identical between the two strains (Fig. 2B). When the corresponding sequences from serogroup X strains were compared to those of other strains belonging to different CCs, the similarity varied from gene to gene or from strain to strain. Nevertheless, the sequence between ctrC and galE of strain 331401 had higher similarity to serogroup X strains than to a serogroup A strain. The sequences upstream of ctrC or downstream of galE had higher similarity to serogroup A. There was an insertion sequence (IS1016) between the ctrA and csxA genes that has been identified as a special genetic characteristic of serogroup X (3).
The sequence characteristics suggested that the recombination breakpoints were located in or near ctrC and galE. Furthermore, the distribution of SNPs indicated that one breakpoint was located within the last 12 bp of ctrC and the first 8 bp of ctrD (an overlap of 4 bp), whereas the other breakpoint was located between positions 688 and 697 of the galE gene (Fig. 2C). The sequence between the two breakpoints was approximately 8,540 bp in strain 510612 and 8,468 bp in strain 331401. Capsular locus sequences from three serogroup X strains belonging to different CCs were analyzed to identify the probable donor. Among these, strain 420815 (an ST-5586 strain isolated in China) had a sequence almost identical to that of 331401, except for the internal sequence between ctrA and csxA. Compared to other serogroup X strains, strain 331401 had a 13-bp deletion and 420815 had an 804-bp insertion in this sequence. The sequences flanking the deduced breakpoints were quite different between strains 420815 and 331401.
Strain-specific genes of strain 331401. Throughout the genome, three specific genes were identified for strain 331401: IS1016, csxA, and csxB. Meanwhile, strain 331401 lacked four genes, sacA to sacD, that were common to serogroup A strains.
DISCUSSION
The results of this study show that strain 331401 was genetically most similar to N. meningitidis ST-7 serogroup A, indicating that it originated from the latter by capsular switching. The capsular switching was caused by recombination of several capsular locus genes. A similar event has been reported between ST-7 serogroups A and C, which involved an even longer capsular locus sequence (16). Both events suggest that N. meningitidis ST-7 serogroup A is capable of taking up long DNA sequences by transformation. Nevertheless, significantly fewer capsular switching events occurred between serogroup A and other serogroups (16, 19) than among other serogroups (5, 10, 20). This peculiarity of serogroup A might be explained partially by the genetic structure of its capsular gene locus, which is distinct from those in other serogroups. However, there should be other factors that limit capsular switching between serogroup A and other N. meningitidis serogroups, such as the imbalanced carriage rate between different serogroups. N. meningitidis ST-7 serogroup A strains were isolated less frequently from healthy carriers than other serogroups that are commonly associated with invasive disease (21). Our recent investigation of meningococcal carriage also showed that N. meningitidis serogroup A was rarely detected even in a population with high incidence (unpublished data).
The high similarity of the recombination sequences in the N. meningitidis ST-7 serogroup X genome to non-serogroup A strains indicates significant horizontal gene transfer between serogroup A and other serogroups. Therefore, there are no fundamental barriers to the exchange of genetic material. However, apart from the capsule gene locus, there was no obvious difference in gene content between ST-7 serogroup X and A strains, although up to eight PRRs were identified in the genome of the serogroup X strain. Thus, the N. meningitidis ST-7 genome is quite stable while possessing strong transformation ability.
The N. meningitidis ST-7 serogroup X strain shared almost identical capsule locus sequences flanking the recombination region with serogroup A, and these flanking sequences were significantly different from those of other serogroup X strains. This indicated that the flanking sequences did not significantly affect the expression of serogroup X polysaccharide and that the determinants of capsule expression are located in the recombination region from ctrC to galE, in which the highly conserved csxA to csxC genes (Fig. 2B) are crucial. This is in accordance with a previous study where the replacement of the ctrA-galE capsular locus sequence in serogroup A with the corresponding serogroup B locus was sufficient to switch capsule expression (22). Unlike the generally conserved recombination region, the intervening sequence between the csxA and ctrA genes, represented by the insertion sequence IS1016, varied significantly. This variation should have little or no effect on capsular expression, which is in accordance with a previous study (23). Interestingly, there were also insertion sequences (IS1031) within recombination regions in other capsular switching events (8, 16). We postulate that the insertion sequences were related to the transfer of the capsule gene locus. The aforementioned study proposed the same theory (23). In the study on capsular switching from ST-7 serogroup A to C, one recombination site was also located in the ctrC gene and the other was in the rfbB gene situated next to galE. If this is not a coincidence, a probable explanation is that the ctrC and galE-rfbB genes are recombination hot spots. In fact, the galE gene was also demonstrated to be a recombination hot spot in both capsular switching and Neisseria species evolution studies (20, 24).
Consistent with previous studies, which found that recombination contributed more than point mutations to the evolution of N. meningitidis (18, 25), our analysis also identified a number of recombination events in the N. meningitidis ST-7 genomes. However, the serogroup X strain had considerably more strain-specific SNPs than the serogroup A strains even when SNPs in the recombination regions were excluded, which suggests that serogroup X or its predecessor underwent an explosion or large accumulation of point mutations. In addition, the indication that each PRR was generated by multiple recombination events suggests that the N. meningitidis ST-7 serogroup X strain experienced a large number of recombination events. Why this explosion of SNPs and recombination events occurred, and whether it affected the rise of serogroup X, remains unknown.
MATERIALS AND METHODS
Meningococcal isolates and DNA preparation. The N. meningitidis strains were propagated on a single plate containing Columbia agar for 18 h at 37°C in 5% CO2. Genomic DNA was extracted using the Wizard Genomic DNA purification kit (Promega, Madison, WI, USA) according to the manufacturer's instructions.
Genome sequencing, assembly, and annotation. The genomes of one ST-7 serogroup X (331401) and four ST-7 serogroup A isolates were sequenced using Solexa sequencing by constructing two paired-end (PE) libraries with an average insert length of 500 bp. Reads were generated using an Illumina HiSeq 2000 sequencing platform (Illumina, San Diego, CA) and assembled into contigs and scaffolds using SOAPdenovo (version 1.04) (26). Reads in which the proportion of low-quality (Q of ≤20) bases exceeded a threshold (10% by default) were removed.
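The read filter described above can be sketched as follows. This is a minimal illustration, not the pipeline actually used: the function names are hypothetical, Phred+33 quality encoding is assumed, and a read is assumed to be discarded only when its low-quality fraction is strictly greater than the threshold.

```python
def low_quality_fraction(quality_string, q_cutoff=20, phred_offset=33):
    """Fraction of bases whose Phred score is <= q_cutoff.

    Assumes Phred+33 encoded quality strings (an assumption, not stated
    in the text).
    """
    if not quality_string:
        return 0.0
    low = sum(1 for ch in quality_string if ord(ch) - phred_offset <= q_cutoff)
    return low / len(quality_string)


def keep_read(quality_string, max_low_fraction=0.10, q_cutoff=20):
    """Keep a read unless more than max_low_fraction of its bases are low quality."""
    return low_quality_fraction(quality_string, q_cutoff) <= max_low_fraction
```

For a paired-end library, one plausible design choice is to drop a pair when either mate fails the filter; the text does not say how pairs were handled.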
The genome of strain 331401 was also sequenced using PacBio. The long reads generated by PacBio RSII were filtered, self-corrected, and assembled into a draft genome using SMRT Analysis with default parameters (version v2.3.0) (27). The gaps in the draft genome from PacBio data were filled with contigs assembled by SOAPdenovo. Finally, the genome sequence was corrected with reads generated by HiSeq 2000 using the GATK tool (version 2.8.1) (28).
Genes were predicted using Glimmer (29) with default parameters and annotated by sequence comparisons with the nonredundant NCBI collection of nucleotide and protein sequence databases and the Swiss-Prot database using BLAST with an E value of 1e−5.
Phylogenetic analysis of genome sequences. A total of 25 N. meningitidis genomes were used to construct a phylogenetic tree. With the exception of the five genomes obtained in this study, the other 20 genomes were downloaded from the NCBI database. The phylogenetic tree of genome sequences was constructed with the following steps. (i) Single-nucleotide polymorphisms (SNPs) were identified by aligning each query sequence to the reference using MUMmer (version 3.22) (30) with the following parameters: -b 200 -c 65 --extend -l 20. SNPs located in repeat regions or within 20 bp upstream or downstream of an N base were filtered out. (ii) All bases at the SNP sites in each genome were concatenated in the same order as in the reference sequence and subjected to phylogenetic analysis. (iii) The phylogenetic tree was constructed with MEGA6 (31) using the neighbor-joining method with default parameters.
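Steps (i) and (ii) can be illustrated with a small sketch. It assumes SNP calls are already available as per-strain dictionaries of 0-based reference positions to alternative bases; the function names and this data layout are hypothetical and do not reflect MUMmer's output format.

```python
def near_ambiguous_base(sequence, pos, flank=20):
    """True if an N lies within `flank` bases of position `pos` (0-based)."""
    window = sequence[max(0, pos - flank): pos + flank + 1]
    return "N" in window.upper()


def snp_alignment(reference, strain_snps):
    """Concatenate the bases at all SNP sites, in reference order.

    reference   : reference genome sequence (str)
    strain_snps : dict mapping strain name -> {0-based position: alt base}
    Returns the sorted SNP sites and one concatenated string per genome,
    including the reference under the key "reference".
    """
    sites = sorted({pos for snps in strain_snps.values() for pos in snps})
    aligned = {"reference": "".join(reference[p] for p in sites)}
    for strain, snps in strain_snps.items():
        # A strain without a call at a site carries the reference base.
        aligned[strain] = "".join(snps.get(p, reference[p]) for p in sites)
    return sites, aligned
```

The concatenated strings have equal length and could be written out as FASTA for a tree-building program such as MEGA.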
Synteny analysis. The genome sequence of strain 510612 (serogroup A, ST-7; GenBank accession number CP007524) was downloaded from NCBI and used as the reference in synteny analysis. The genome sequence fragments of strain 331401 were ordered according to the reference genome based on MUMmer alignments. The upper and lower axes of the linear synteny graph were then drawn after scaling both sequences by the same proportion. Based on the BLAST results, each pair of aligned nucleotide sequences was marked in the coordinate diagram according to its position after the same scaling.
Detection of insertion and deletion mutations. The reference and query sequences were aligned using LASTZ software (version 1.03.73) (32). After a series of refinements using axt_correction, axtSort, and axtBest, the best alignment results were chosen and the insertion and deletion mutations were preliminarily obtained. A sequence of 150 bp upstream and downstream of the mutation site in the reference genome was extracted and then aligned with the query reads. The alignment results were verified with BWA (version 0.7.12) and SAMtools (version 1.2) (33).
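Extracting the 150-bp flanks around a candidate mutation site can be sketched as below. The function name and the 0-based half-open coordinates are assumptions made for illustration, and the window is simply clipped at the sequence ends rather than wrapped around the circular chromosome.

```python
def flanking_window(reference, site_start, site_end, flank=150):
    """Return (left, right, subsequence) covering the site plus flanks.

    Coordinates are 0-based and half-open; the window is clipped at the
    ends of the sequence (circularity is ignored in this sketch).
    """
    left = max(0, site_start - flank)
    right = min(len(reference), site_end + flank)
    return left, right, reference[left:right]
```

The extracted subsequence would then serve as the target against which reads are mapped to confirm or reject the candidate indel.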
Identification of recombination. Recombination events in the genome were identified as follows. (i) SNPs in the genomes of strain 331401 and 12 ST-7 serogroup A strains were identified as described for the phylogenetic analysis of genome sequences. (ii) Strain-specific SNPs were extracted, and the frequency of SNPs in 100-bp windows was calculated for each genome. (iii) Sequences with a high frequency of strain-specific SNPs were assumed to be probable recombination regions (PRRs). (iv) The capsule locus sequences from serogroup X and A strains were aligned using MUSCLE software (version 3.8.31). From the resulting SNP pattern, we deduced the probable recombination breakpoints of strain 331401.
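Steps (ii) and (iii) can be sketched as a simple density scan. The example assumes non-overlapping 100-bp windows, the cutoff of at least 5 strain-specific SNPs per window given in Results, and merging of directly adjacent dense windows into one region; the windowing and merging rules, like the function names, are assumptions rather than details given in the text.

```python
from collections import Counter


def snp_counts_per_window(snp_positions, window=100):
    """Count SNPs in non-overlapping windows; keys are window indices."""
    return Counter(pos // window for pos in snp_positions)


def probable_recombination_regions(snp_positions, window=100, cutoff=5):
    """Return merged (start, end) regions whose windows hold >= cutoff SNPs.

    Positions are 0-based; regions are half-open. Adjacent dense windows
    are merged into a single region.
    """
    counts = snp_counts_per_window(snp_positions, window)
    dense = sorted(w for w, n in counts.items() if n >= cutoff)
    regions = []
    for w in dense:
        start, end = w * window, (w + 1) * window
        if regions and regions[-1][1] == start:
            regions[-1] = (regions[-1][0], end)  # extend the previous region
        else:
            regions.append((start, end))
    return regions
```

A sliding window with a smaller step would localize region boundaries more finely, at the cost of more windows to evaluate.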
Identification of strain-specific genes. The same panel of strains as that for identification of recombination was used, and strain-specific genes of 331401 were detected by the following steps. (i) Genes in each genome were predicted using Glimmer (29) with default parameters. (ii) Predicted genes were compared to those from reference strain 510612 using BLAST to determine the orthologs, and then probable strain-specific genes were extracted. (iii) The probable strain-specific genes were compared with the complete genome sequences of all other strains using BLAST with an E value of 1e−5 to validate specificity.
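Once orthologs have been assigned, the final comparison reduces to set operations on gene content. The sketch below assumes each strain is represented by a set of ortholog names; the function name, strain labels, and toy gene sets are hypothetical and only loosely echo the genes named in Results.

```python
def strain_specific_genes(gene_sets, target):
    """Return (genes only in target, genes in all other strains but not target).

    gene_sets : dict mapping strain name -> set of ortholog names
    target    : name of the strain of interest
    """
    others = [genes for name, genes in gene_sets.items() if name != target]
    if not others:
        return set(gene_sets[target]), set()
    present_elsewhere = set().union(*others)
    shared_by_others = set.intersection(*others)
    gained = gene_sets[target] - present_elsewhere
    missing = shared_by_others - gene_sets[target]
    return gained, missing


# Toy gene content for illustration only.
toy = {
    "X": {"ctrA", "csxA", "csxB", "IS1016", "galE"},
    "A1": {"ctrA", "sacA", "sacB", "sacC", "sacD", "galE"},
    "A2": {"ctrA", "sacA", "sacB", "sacC", "sacD", "galE", "extra"},
}
gained, missing = strain_specific_genes(toy, "X")
```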
Accession number(s). The genome sequence obtained in this study was submitted to GenBank under the accession number CP012694.
ACKNOWLEDGMENTS
This work was supported by grants from the National Key Program for Infectious Disease of China (2013ZX10004221), the State Key Laboratory of Infectious Disease Prevention and Control (2015SKLID502), and the National Natural Science Foundation of China (81602903).
FOOTNOTES
- Received 13 December 2016.
- Returned for modification 29 January 2017.
- Accepted 12 March 2017.
- Accepted manuscript posted online 20 March 2017.
- Copyright © 2017 American Society for Microbiology.