Infection and Immunity
Molecular Genomics

Multilocus Sequence Typing of Streptococcus pyogenes and the Relationships between emm Type and Clone

Mark C. Enright, Brian G. Spratt, Awdhesh Kalia, John H. Cross, Debra E. Bessen
Mark C. Enright
Wellcome Trust Centre for the Epidemiology of Infectious Diseases, University of Oxford, Oxford, United Kingdom
Brian G. Spratt
Wellcome Trust Centre for the Epidemiology of Infectious Diseases, University of Oxford, Oxford, United Kingdom
Awdhesh Kalia
Department of Epidemiology and Public Health, Yale University School of Medicine, New Haven, Connecticut
John H. Cross
Department of Epidemiology and Public Health, Yale University School of Medicine, New Haven, Connecticut
Debra E. Bessen
Department of Epidemiology and Public Health, Yale University School of Medicine, New Haven, Connecticut
DOI: 10.1128/IAI.69.4.2416-2427.2001

ABSTRACT

Multilocus sequence typing (MLST) is a tool that can be used to study the molecular epidemiology and population genetic structure of microorganisms. An MLST scheme was developed for Streptococcus pyogenes, and the nucleotide sequences of internal fragments of seven selected housekeeping loci were obtained for 212 isolates. A total of 100 unique combinations of housekeeping alleles (allelic profiles) were identified. The MLST scheme was highly concordant with several other typing methods. The emm type, corresponding to a locus that is subject to host immune selection, was determined for each isolate; of the >150 distinct emm types identified to date, 78 are represented in this report. For a given emm type, the majority of isolates shared five or more of the seven housekeeping alleles. Stable associations between emm type and MLST were documented by comparing isolates obtained decades apart and/or from different continents. For the 33 emm types for which more than one isolate was examined, only five emm types were present on widely divergent backgrounds, differing at four or more of the housekeeping loci. The findings indicate that the majority of emm types examined define clones or clonal complexes. In addition, an MLST database is made accessible via the internet (http://www.mlst.net) to investigators who seek to characterize other isolates of this species.

Group A streptococci (GAS; Streptococcus pyogenes) are highly prevalent bacterial pathogens with a worldwide distribution, and humans serve as their primary biological host. Most often, GAS infect superficial tissue sites, involving the mucosal epithelium of the upper respiratory tract (URT) or the epidermal layer of the skin, leading to pharyngitis or impetigo, respectively. On rare occasions, a GAS infection can lead to invasive disease that includes cellulitis, bacteremia, necrotizing fasciitis, and toxic shock syndrome, which can be life-threatening conditions. In addition, GAS contribute to morbidity through delayed nonsuppurative sequelae, such as rheumatic fever and acute glomerulonephritis.

The M and M-like proteins of GAS form surface fibrils that provide the basis for a widely used serological typing scheme. For many molecules studied in detail, the M serotype (M type) is usually defined by antigenic target sites contained within the distal, amino-terminal ends of these fibrillar proteins, and >80 distinct M types have been identified. M and M-like proteins are also key virulence factors, and protective immunity against GAS infection is type specific (8, 23). More recently, a genotypic typing scheme based on the emm genes that encode M and M-like proteins has become widely used, and >150 different emm types have been characterized (15; http://www.cdc.gov/ncidod/biotech/strep/strains.html). The antigenic heterogeneity exhibited by this family of proteins reflects the strong impact of host immunity on the generation of diversity within this bacterial species.

Numerous other genotypic methods have been developed for the typing of GAS isolates. Vir-typing measures restriction fragment length polymorphisms within the emm chromosomal region (18). Pulsed-field gel electrophoresis and arbitrary-primed PCR can provide high levels of resolution between strains by measuring multiple loci for differences that are not necessarily under selection (10, 17, 18, 33). Another important tool for discrimination among strains of GAS is multilocus enzyme electrophoresis (MLEE), which indexes differences in the net charge of housekeeping enzymes resulting from certain mutations (29, 32).

Multilocus sequence typing (MLST) is a nucleotide sequence-based method that is well suited to characterizing the genetic relationships between the organisms of a bacterial species (12-14, 26). Because it is based on nucleotide sequence, it provides unambiguous results and is easily portable from lab to lab. Housekeeping loci are chosen for analysis because they are present in every organism (i.e., their products serve a vital function), and mutations within them are largely assumed to be selectively neutral (32). Clones, defined as isolates that are descendants of a recent common ancestor, can be identified as having shared alleles at each of the housekeeping loci. In this report, an MLST scheme using seven housekeeping loci was used to evaluate >200 GAS isolates that were derived from several continents, spanning a time period of >50 years and representing 78 distinct emm types.

MATERIALS AND METHODS

Bacterial strains. The GAS isolates of the MGAS series were kindly provided by Susan Hollingshead (University of Alabama at Birmingham), who had received them from James Musser, and the isolates have been previously described in detail (29, 30). The GAS isolates of the CT98 series were kindly provided by James Hadler and Nancy Barrett (State of Connecticut Department of Public Health, Hartford). Strain 700294 was purchased from the American Type Culture Collection (Manassas, Va.). All other GAS isolates have been previously described (6, 17).

Multilocus sequence typing. Chromosomal DNA was prepared from freshly grown GAS by previously described methods (6). Internal fragments of the glucose kinase (gki), glutamine transporter protein (gtr), glutamate racemase (murI), DNA mismatch repair protein (mutS), transketolase (recP), xanthine phosphoribosyl transferase (xpt), and acetyl coenzyme A (acetyl-CoA) acetyltransferase (yqiL) genes were amplified by PCR using the following primer pairs: gki-up, 5′-GGC ATT GGA ATG GGA TCA CC-3′, and gki-dn, 5′-TCT CCT GCT GCT GAC AC-3′; gtr-up, 5′-GAG GTT GTG GTG ATT ATT GG-3′, and gtr-dn, 5′-GCA AAG CCC ATT TCA TGA GTC-3′; murI-up, 5′-TGC TGA CTC AAA ATG TTA AAA TGA TTG-3′, and murI-dn, 5′-GAT GAT AAT TCA CCG TTA ATG TCA AAA TAG-3′; mutS-up, 5′-GAA GAG TCA TCT AGT TTA GAA TAC GAT-3′, and mutS-dn, 5′-AGA GAG TTG TCA CTT GCG CGT TTG ATT GCT-3′; recP-up, 5′-GCA AAT TCT GGA CAC CCA GG-3′, and recP-dn, 5′-CTT TCA CAA GGA TAT GTT GCC-3′; xpt-up, 5′-TTA CTT GAA GAA CGC ATC TTA-3′, and xpt-dn, 5′-ATG AGG TCA CTT CAA TGC CC-3′; yqiL-up, 5′-TGC AAC AGT ATG GAC TGA CCA GAG AAC AAG ATG C-3′, and yqiL-dn, 5′-CAA GGT CTC GTG AAA CCG CTA AAG CCT GAG-3′. The PCRs were performed in volumes of 50 μl, with an initial denaturation at 95°C for 4 to 5 min, followed by 28 cycles of 95°C for 1 min, 55°C for 1 min, and 72°C for 1 min. The amplified DNA fragments were purified either by precipitation with polyethylene glycol or using a PCR purification kit (Qiagen, Valencia, Calif.). The sequence of each fragment was obtained on both strands by using the same primers as those in the initial PCR amplifications and an ABI 377 or ABI 3700 DNA sequencer (Perkin-Elmer Applied Biosystems, Foster City, Calif.).

For each locus, every different sequence was assigned a distinct allele number, and each isolate was defined by a series of seven integers (the allelic profile) corresponding to the alleles at the seven loci, in the order (alphabetical) of gki, gtr, murI, mutS, recP, xpt, and yqiL. Isolates with an identical allelic profile were assigned to the same sequence type (ST).
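The bookkeeping described above can be sketched in a few lines of Python. This is a minimal illustration, not the software used for the study: the function name `assign_sequence_types`, the toy sequences, and the choice to number alleles and STs in order of first appearance are all assumptions made for the example.

```python
from typing import Dict, Tuple

# Locus order follows the text (alphabetical).
LOCI = ("gki", "gtr", "murI", "mutS", "recP", "xpt", "yqiL")

Profile = Tuple[int, ...]


def assign_sequence_types(
    isolates: Dict[str, Dict[str, str]]
) -> Dict[str, Tuple[Profile, int]]:
    """Map each isolate name to (allelic profile, ST number).

    Numbering by order of first appearance is an assumption of this
    sketch; it does not reproduce the allele or ST numbers of the paper.
    """
    allele_numbers: Dict[str, Dict[str, int]] = {locus: {} for locus in LOCI}
    st_numbers: Dict[Profile, int] = {}
    typed: Dict[str, Tuple[Profile, int]] = {}
    for name, fragments in isolates.items():
        profile = []
        for locus in LOCI:
            known = allele_numbers[locus]
            sequence = fragments[locus].upper()
            # Every different sequence at a locus gets a new allele number.
            if sequence not in known:
                known[sequence] = len(known) + 1
            profile.append(known[sequence])
        key = tuple(profile)
        # Identical allelic profiles share one sequence type.
        if key not in st_numbers:
            st_numbers[key] = len(st_numbers) + 1
        typed[name] = (key, st_numbers[key])
    return typed


# Toy fragments (hypothetical, far shorter than the real sequenced regions).
base = {locus: "ACGT" for locus in LOCI}
toy_isolates = {
    "isolate_A": dict(base),
    "isolate_B": dict(base),
    "isolate_C": {**base, "recP": "ACGA"},  # one substitution in recP
}
typed = assign_sequence_types(toy_isolates)
```

In this toy input, isolates A and B share a profile and therefore an ST, whereas the single recP substitution in isolate C is enough to give it a new allele number and a new ST.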

emm sequence typing. emm sequence typing is based on the 5′ end of the central emm gene within the emm chromosomal region (for map, see references 5 and 6). A unique emm type is defined as having <95% sequence identity to any other known emm type over 160 bp near the 5′ end, as specified (http://www.cdc.gov/ncidod/biotech/strep/strains.html). There is a very strong correspondence between M type, as determined by serology, and the emm type that meets the stated definition (3, 15). In addition to a sequence identity of ≥95%, indels of four or fewer codons and/or frameshift mutations relative to the reference emm typing strain are allowed for classification as an established emm type. Until validation is complete, new emm types are assigned the nomenclature “emmst,” which stands for emm sequence type (15) and is not to be confused with “ST,” which refers to the MLST allelic profile.
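The ≥95% identity rule over a 160-bp window implies that a sequence may differ from its reference at up to eight positions and still be assigned the same emm type. The Python sketch below illustrates only that threshold. It assumes the two sequences are already aligned and of equal length, and it ignores the indel and frameshift allowances described above; the function names are hypothetical.

```python
def percent_identity(query: str, reference: str) -> float:
    """Percent of matching positions between two pre-aligned sequences."""
    if not reference or len(query) != len(reference):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(q == r for q, r in zip(query.upper(), reference.upper()))
    return 100.0 * matches / len(reference)


def same_emm_type(query: str, reference: str, threshold: float = 95.0) -> bool:
    """True if the query meets the identity threshold for the reference type."""
    return percent_identity(query, reference) >= threshold


def substitute_first(sequence: str, count: int) -> str:
    """Toy helper: change the first `count` bases to a different base."""
    changed = ["T" if base != "T" else "A" for base in sequence[:count]]
    return "".join(changed) + sequence[count:]


reference_160 = "ACGT" * 40                        # hypothetical 160-bp window
eight_changes = substitute_first(reference_160, 8)  # 152/160 = 95.0% identity
nine_changes = substitute_first(reference_160, 9)   # 151/160 = 94.375% identity
```

With these toy sequences, `same_emm_type(eight_changes, reference_160)` holds while `same_emm_type(nine_changes, reference_160)` does not, which is the boundary the definition draws.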

Computations. A matrix of pair-wise differences in allelic profiles was constructed, and the similarities between the allelic profiles of the isolates were assessed by cluster analysis using the unweighted pair-group method with arithmetic averages (UPGMA) and the percent disagreement distance measure (Statistica version 5.5; StatSoft, Tulsa, Okla.). The maximum percent nucleotide divergence and average percent nucleotide divergence between pairs of alleles at a given locus were calculated using Mega version 2.0 (http://www.megasoftware.net). The Index of Association (27) was used to test for linkage disequilibrium between alleles at the seven housekeeping loci. The observed variance in the distribution of allelic mismatches in all pair-wise comparisons of the allelic profiles was compared to that expected in a freely recombining population (linkage equilibrium). The significance of the difference in the observed and expected variance was evaluated by computing the maximum variance in the distribution of allelic mismatches obtained using 100 randomizations of the data set. Significant linkage disequilibrium was established if the observed variance obtained with the actual data was greater than that found with any of the 100 randomized data sets; otherwise, there was no evidence of a departure from linkage equilibrium.
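The variance comparison and randomization test described above can be sketched as follows. The code uses one common formulation of the Index of Association, I_A = V_O/V_E − 1 with V_E = Σ h_j(1 − h_j), where h_j is the allelic diversity at locus j. Whether the original analysis used exactly this form (for example, with a small-sample correction) is an assumption, and all function names are hypothetical.

```python
import random
from collections import Counter
from itertools import combinations
from statistics import pvariance
from typing import List, Sequence, Tuple

Profile = Sequence[int]


def mismatch_variance(profiles: List[Profile]) -> float:
    """Observed variance of the number of mismatching loci over all pairs."""
    mismatches = [
        sum(a != b for a, b in zip(p, q)) for p, q in combinations(profiles, 2)
    ]
    return pvariance(mismatches)


def expected_variance(profiles: List[Profile]) -> float:
    """Variance expected at linkage equilibrium: sum of h_j * (1 - h_j)."""
    n = len(profiles)
    total = 0.0
    for column in zip(*profiles):  # one column of alleles per locus
        h = 1.0 - sum((count / n) ** 2 for count in Counter(column).values())
        total += h * (1.0 - h)
    return total


def index_of_association(profiles: List[Profile]) -> float:
    """I_A is near zero at equilibrium and positive under disequilibrium."""
    return mismatch_variance(profiles) / expected_variance(profiles) - 1.0


def randomization_test(
    profiles: List[Profile], trials: int = 100, seed: int = 0
) -> Tuple[float, float, bool]:
    """Compare the observed variance with the maximum over shuffled data sets.

    Alleles are shuffled among isolates independently at each locus, which
    keeps allele frequencies but breaks associations between loci.
    """
    rng = random.Random(seed)
    observed = mismatch_variance(profiles)
    columns = [list(column) for column in zip(*profiles)]
    max_random = 0.0
    for _ in range(trials):
        for column in columns:
            rng.shuffle(column)
        max_random = max(max_random, mismatch_variance(list(zip(*columns))))
    return observed, max_random, observed > max_random


# Toy population (hypothetical): two fully linked genotypes, five isolates each.
clonal = [(1,) * 7] * 5 + [(2,) * 7] * 5
i_a = index_of_association(clonal)
observed, max_random, significant = randomization_test(clonal)
```

In the toy data every locus carries the same partition of isolates, so the observed variance is as large as the allele frequencies allow and the test reports disequilibrium. Note that `expected_variance` is zero when every locus is monomorphic, in which case I_A is undefined.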

RESULTS

Housekeeping loci used for MLST. Seven housekeeping loci were chosen for the characterization of GAS isolates by MLST and for determining their population genetic structure (Table 1). The nucleotide sequence was determined for an internal portion of about 400 to 500 bp of each gene. The loci that were chosen had been used successfully for pneumococci (14) or were selected with guidance by data from the University of Oklahoma GAS genome sequencing project that is available on the World Wide Web. Large contigs from the database (www.genome.ou.edu) were used in BLASTX searches against the GenBank database. Housekeeping loci were identified based on their putative function. Loci selected for this study were devoid of flanking regions containing genes that are likely to be under selection for variation (e.g., genes encoding cell surface proteins that may be under diversifying selection from the host immune response). The only possible exception was recP, positioned ∼9 kb from a putative penicillin-binding protein gene (pbp2x homologue). However, analysis of a set of 14 isolates showed nucleotide sequence divergence of <1.0% for an internal portion of pbp2x and a lack of evidence for interspecies recombinational events, as has been observed for pneumococcal and meningococcal pbp genes (11) (data not shown). Furthermore, GAS isolates that are resistant to penicillin have not been described as occurring in nature. Ten housekeeping loci were initially examined in a small subset of strains, and the least and most polymorphic ones were discarded. The chromosomal distance between any two loci, calculated on the basis of the tentative genome map of strain 700294, ranges from 20 to 600 kb (www.genome.ou.edu); it is possible that for other strains, the genomic location of the loci under study may differ.

Table 1. Housekeeping loci under study

The number of unique alleles identified for each of the seven housekeeping loci ranged from 21 (for mutS) to 35 (for recP) (Table 1). The maximum percent nucleotide sequence divergence between the alleles of a given locus ranged from 1.4% (for yqiL and murI) to 6.1% (for recP). For one housekeeping locus, recP, there were four widely divergent alleles (recP7, recP15, recP21, recP29) which may have arisen by importation of homologous regions from closely related species. As noted above, the recP gene is ∼9 kb from a pbp2x gene; however, pbp2x alleles display low levels of polymorphism, and there were no obvious differences between the pbp2x alleles of isolates recovered in the pre-antibiotic era (early 1940s) and those obtained in recent decades (data not shown). The sequence was determined for part of the pbp2x gene of an isolate containing one of the diverged recP alleles (recP7); this strain (C135) possessed the most prevalent pbp2x allele, and there is no evidence that the increased divergence of some recP alleles is due to hitchhiking driven by selection for interspecies recombination at the pbp2x locus. A more complete analysis of the housekeeping alleles is presented elsewhere (16; A. Kalia, M. C. Enright, B. G. Spratt, and D. E. Bessen, submitted for publication).

MLST of the GAS population. The collection of 212 GAS isolates (Table 2) was assembled with several goals in mind. First, a genetically diverse group of GAS strains was desired. As will be shown in this report, emm type is a sensitive measure of genetic diversity. Of the >150 emm types characterized to date (http://www.cdc.gov/ncidod/biotech/strep/strains.html), isolates representing 78 emm types were included in the MLST analysis. Second, it was of interest to evaluate GAS with large temporal and/or spatial distances between their isolation from human tissue, in order to assess the stability of clones. In addition, the selected GAS isolates were recovered in association with a variety of host tissues and diseases, including deep soft tissue infections. Finally, several GAS that had been previously analyzed using different molecular typing schemes were chosen for comparison to MLST, in order to provide validation of the new method.

Table 2. MLST of 212 GAS isolates

The sequences of the seven loci were determined for each of the 212 GAS isolates, and their allelic profiles were assigned. One hundred different allelic profiles were found, corresponding to ST1 through ST100. Sixty-six of the 100 STs were represented by only a single isolate; the number of isolates assigned to the other STs ranged from 2 to 16.

The average number of alleles per locus was 28.1, and therefore, the GAS MLST scheme is able to distinguish >13 billion different allelic profiles. An isolate with the most common allele at each of the seven loci is expected to occur, by chance, at a frequency of 7.5 × 10⁻⁵ (no isolates with this allelic profile were found among the 212 strains); most allelic profiles will occur by chance at much lower frequencies. Thus, it is extremely unlikely that two unrelated GAS isolates will have the same allelic profile.
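The arithmetic behind these figures can be checked with a short Python sketch. The first function raises the average allele count to the seventh power; the second multiplies per-locus allele frequencies, which assumes the loci are independent. The per-locus frequencies themselves are not given in the text, so the example back-calculates only their geometric mean; the function names are hypothetical.

```python
import math
from typing import Iterable


def profile_capacity(alleles_per_locus: float, n_loci: int = 7) -> float:
    """Approximate number of distinguishable allelic profiles."""
    return alleles_per_locus ** n_loci


def chance_profile_frequency(allele_frequencies: Iterable[float]) -> float:
    """Expected frequency of a profile if alleles at the loci were independent."""
    return math.prod(allele_frequencies)


capacity = profile_capacity(28.1)  # roughly 1.38e10, i.e. >13 billion

# Geometric-mean allele frequency implied by the reported 7.5e-5 over 7 loci.
implied_mean_frequency = (7.5e-5) ** (1 / 7)  # roughly 0.26
```

On these numbers, the most common allele at each locus would need an average frequency of only about one in four to give the reported chance frequency of 7.5 × 10⁻⁵.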

Relationships between emm type and MLST. A matrix of pair-wise differences in allelic profiles was determined, and a dendrogram displaying the genetic linkage distance between the 212 isolates was constructed by cluster analysis using UPGMA (Fig. 1). In the dendrogram presented in Fig. 1, the 15 STs that are represented by four or more isolates are depicted. In 13 of these STs, all isolates are of a single emm type. It was of interest to further ascertain the strength of the associations between emm type and ST among GAS. In other words, how well does emm type equate to clone?

Fig. 1. Dendrogram showing UPGMA cluster analysis of 212 GAS isolates. Bars to the left show allelic profiles (STs) represented by four or more isolates. Codes for strain designations at branch tips are listed in Table 2. Filled circles (n = 28) mark branches in which multiple descendants are all represented by a single emm type. Open circles (n = 3) mark branches containing isolates with different emm types but sharing identical allelic profiles.

For analysis of the relationships between emm type and MLST, selection criteria for GAS isolates were set to minimize the inclusion of epidemiologically related clones. Therefore, our analysis was specifically intended to provide a conservative estimate of the strength of the association between emm type and allelic profile. Multiple isolates of the same emm type and ST combination were included in the analysis only if they were recovered from subjects located on different continents or isolated >1 year apart within the same continent. Also, at least one representative of every unique emm type-ST combination was included. emm types represented by four or more isolates satisfying the above-stated epidemiologic criteria (n = 15 emm types and 81 isolates in total) were assessed for the genetic distances between all possible pair-wise comparisons of alleles of the seven housekeeping loci (Table 3). This provides a measure of the genetic diversity at multiple loci within a set of epidemiologically unrelated organisms that share an emm type.

Table 3. Pair-wise comparisons of housekeeping alleles among isolates of the same emm type

For six of the 15 emm types assessed (emm2, emm5, emm6, emm12, emm18, emm33), representing a total of 30 isolates, all isolates within an emm type displayed identical allelic profiles and can be regarded as clones (Table 3). Identical allelic profiles were observed for some organisms isolated >50 years apart (Table 2), indicating that GAS clones can be stable over this prolonged time period. One emm type (emm19) had isolates differing at one locus only, whereas two emm types had isolates differing at two loci (emm3, emm89). Isolates differing at two or fewer housekeeping loci (out of seven) can be regarded as clones or clonal complexes (16).

For epidemiologically distant organisms, as defined above, that were represented by only two or three isolates of the same emm type (n = 18 emm types), 11 emm types had identical allelic profiles, whereas five emm types differed at only one or two of the seven loci (Table 2). Although in some instances the sample size was small, emm type appears to closely correlate with clone or clonal complex for the majority (25 out of 33, or 76%) of emm types studied.

For several emm types represented by four or more epidemiologically distant isolates, there was a higher degree of genetic diversity. For three emm types (emm4, emm11, and emm49), pair-wise comparisons showed differences among three of the seven housekeeping loci (Table 3). An additional three emm types displayed differences at five or more of the housekeeping loci: emm1, emm44/61, and emm77 (also known as emm27L/77). Perhaps it is of biological relevance that isolates of two of the emm types (emm44/61 and emm77) were recently reported to be found in association with more than one sof allele, which provides the basis for a second major serological typing scheme for GAS (4). For emm1 isolates, pair-wise comparisons indicated that this group is the most genetically diverse (Table 3). However, of the nine epidemiologically distant isolates evaluated, eight differed from one another at three or fewer of the seven loci (Table 2); furthermore, the emm1 isolates cluster together, and there is a single node on the dendrogram from which all but one of the 23 emm1 isolates descend (Fig. 1). One emm1 isolate (MGAS2110; ST91) differs from the other emm1 isolates at six or seven of the seven housekeeping loci. In addition to the emm1, emm44/61, and emm77 isolates, the only other examples found for a single emm type on widely divergent genetic backgrounds are emm91 and emm93, whereby two isolates of each type differ at three and five of the seven housekeeping loci, respectively (Table 2).

The genetic distances within an emm type can be compared to the genetic distance between the 100 different STs identified. By definition, none of the isolates representing each of the 100 unique STs shared alleles at all seven of the housekeeping loci. Whereas the majority of epidemiologically distant isolates within an emm type differed at two or fewer loci, 95% of the distinct allelic profiles (i.e., ST1 through ST100) differed from each other at five or more loci (Table 3). Furthermore, nearly half of the 4,950 possible pair-wise comparisons among the 100 STs differed at all seven housekeeping loci. Thus, comparisons between individual GAS clones most often reveal large genetic distances, contrasting sharply with the similar genotypes typically found within an emm type.
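The figure of 4,950 comparisons is the number of unordered pairs among 100 STs, and the distribution of locus differences across those pairs can be tallied directly. The Python sketch below shows the tally on a toy set of profiles; the profiles are hypothetical and the function name is an assumption, so the output does not reproduce Table 3.

```python
from collections import Counter
from itertools import combinations
from math import comb
from typing import Dict, List, Sequence


def locus_difference_histogram(profiles: List[Sequence[int]]) -> Dict[int, int]:
    """Count pairs of profiles by the number of loci at which they differ."""
    return dict(
        Counter(
            sum(a != b for a, b in zip(p, q))
            for p, q in combinations(profiles, 2)
        )
    )


pairs_among_100_sts = comb(100, 2)  # 4950 unordered pairs

# Toy profiles (hypothetical): two near-identical STs and one distant ST.
toy_sts = [
    (1, 1, 1, 1, 1, 1, 1),
    (1, 1, 1, 1, 2, 1, 1),
    (5, 6, 7, 8, 9, 4, 3),
]
histogram = locus_difference_histogram(toy_sts)
```

Here the histogram records one pair differing at a single locus and two pairs differing at all seven, the same kind of contrast the text draws between within-type and between-ST comparisons.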

There were several examples of isolates with identical allelic profiles that differed in emm type: emm86 and emmstD626 (ST9), emm53 and emmstNS5 (ST11), and emm19, emm29, and emmstRP31 (ST65) (Fig. 1). It is extremely unlikely that these examples of multiple emm types within a clone are due to a lack of discrimination of the GAS MLST scheme. For example, a single isolate with the allelic profile of ST65 was expected to occur by chance in the data set at a frequency of 2.2 × 10⁻⁸, and the likelihood of unrelated emm19, emm29, and emmstRP31 isolates having this allelic profile is essentially zero.

One emm type present on two or more genetically distant backgrounds, or multiple emm types present on a single genetic background, may have arisen as a consequence of the lateral movement of emm genes between different GAS strains. In GAS, generalized transduction by bacteriophage is the most probable mechanism for horizontal gene transfer.

Levels of linkage disequilibrium within the GAS population. The extent of recombination within the GAS population was assessed by the Index of Association (27). Using one isolate of each of the 100 STs, there was significant linkage disequilibrium between the alleles at each of the seven housekeeping loci. However, in populations in which recombination is sufficient to randomize the alleles at different loci over a longer term, the recent expansion of clones can result in the appearance of multiple isolates with similar genotypes (27). Therefore, the Index of Association was recalculated using one isolate of each of the 72 STs obtained by truncating the dendrogram (Fig. 1) at a genetic distance of 0.3; no significant linkage disequilibrium between alleles was observed. The truncation effectively reduced each clonal complex to a single representative strain and thereby diminished any bias introduced by the oversampling of select emm types.
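Truncating a UPGMA dendrogram at a fixed height can be sketched as merging clusters by average percent disagreement until the closest remaining pair is at or above the cutoff. With seven loci, a difference at two loci corresponds to a distance of 2/7 ≈ 0.286 and a difference at three loci to 3/7 ≈ 0.429, so a cutoff of 0.3 falls between them. The pure-Python code below is an illustration only, not the Statistica procedure used in the study; the function names are hypothetical, and tie-breaking between equally close pairs may differ from other implementations.

```python
from itertools import combinations
from typing import List, Sequence


def percent_disagreement(p: Sequence[int], q: Sequence[int]) -> float:
    """Fraction of loci at which two allelic profiles differ."""
    return sum(a != b for a, b in zip(p, q)) / len(p)


def upgma_clusters(
    profiles: List[Sequence[int]], cutoff: float = 0.3
) -> List[List[int]]:
    """Group profile indices by average linkage, stopping at the cutoff."""
    clusters = [[i] for i in range(len(profiles))]

    def average_distance(c1: List[int], c2: List[int]) -> float:
        # UPGMA: arithmetic mean over all between-cluster pairs of members.
        total = sum(
            percent_disagreement(profiles[i], profiles[j]) for i in c1 for j in c2
        )
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        x, y = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average_distance(clusters[pair[0]], clusters[pair[1]]),
        )
        if average_distance(clusters[x], clusters[y]) >= cutoff:
            break  # every remaining merge would lie above the truncation height
        merged = clusters[x] + clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)
    return clusters


# Toy profiles (hypothetical): three closely related STs and one distant ST.
toy_profiles = [
    (1, 1, 1, 1, 1, 1, 1),
    (1, 1, 1, 1, 1, 1, 2),
    (1, 1, 1, 1, 1, 2, 2),
    (9, 9, 9, 9, 9, 9, 9),
]
groups = upgma_clusters(toy_profiles)
representatives = [min(group) for group in groups]  # one profile per cluster
```

Keeping one representative per cluster mirrors the reduction from 100 STs to 72 described above, after which the Index of Association would be recalculated on the representatives alone.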

Comparison of MLST to other typing methods. The high degree of concordance between ST and emm type provides strong evidence that the MLST typing scheme leads to accurate identification of clones or clonal complexes. The MLST scheme can be further validated by comparison to other typing methods. Isolates that had been previously assessed by MLEE, as reported by others (22, 29, 30), were compared for emm type, ST, and electrophoretic type (ET) (Table 4). For organisms represented by one or more isolates of the same emm type-ST combination, 20 were also concordant for ET, whereas 9 were discordant with ET; however, for the discordant ETs, several were genetically close in their relationship. For organisms represented by one or more isolates of the same emm type-ET combination, 20 out of 21 were also concordant for ST.

Table 4. Comparison of MLST to other typing methods

Arbitrary-primed PCR, yielding random amplified polymorphic DNA (RAPD) profiles, has been previously conducted on another subset of the GAS isolates reported here (17). For organisms represented by one or more isolates of the same emm type-ST combination, nine also had concordant RAPD profiles, whereas seven displayed distinct RAPD profiles (Table 4). However, for organisms represented by one or more isolates of the same emm type-RAPD profile combination, 9 out of 10 were also concordant for ST.

Although the level of strain resolution differs for emm typing, MLEE, and RAPD analysis, each method displays high levels of concordance with the new MLST scheme.

GAS causing invasive disease. A total of 84 GAS isolates associated with invasive disease in the United States between 1986 and 1999 were included in this study. Thirty distinct emm types were represented by 34 unique allelic profiles (Fig. 2). Among the subset of invasive disease isolates, there was a high one-to-one correspondence between emm type and ST. However, for the vast majority of pair-wise comparisons between invasive disease isolates of different emm types, there were differences at four or more loci. Therefore, invasive disease caused by GAS can be attributed to a large number of genetically diverse strains or clones, confirming other reports (2, 17, 29, 35). However, two major clusters of isolates with identical or very similar allelic profiles were identified. These two clusters contained isolates of emm1 and emm3, which are the emm types most commonly recovered from invasive disease in the United States during the 1990s (2, 17, 35).

Fig. 2. Dendrogram of invasive isolates from the United States (1986 to 1999). UPGMA cluster analysis of all 84 isolates derived from normally sterile tissue sites, as listed in Table 2, is shown. The nomenclature at the branch tips indicates emm type, followed by the two-letter abbreviation of the state of origin and a unique isolate number (where necessary).

DISCUSSION

A primary objective of this report is to provide the foundation for a new typing scheme for GAS that can be readily expanded upon by other investigators. In general terms, the value of molecular typing schemes lies in their ability to discriminate between the various strains within a bacterial species. However, high levels of discrimination are often achieved by indexing variation that accumulates very rapidly, making it difficult to demonstrate the relatedness of isolates that have diversified from a common ancestor that existed many decades ago. Variation within the nucleotide sequences of housekeeping genes accumulates relatively slowly, and as demonstrated in this report, isolates with the same allelic profile can be recovered many decades apart. Although the genetic variation indexed by MLST accumulates slowly, the multilocus approach allows for a vast number of distinct genotypes to be distinguished. Furthermore, MLST has high resolving power and, in many instances, it can discriminate among isolates of a single emm type.

The clustering of isolates achieved by MLST was in good agreement with that obtained using other typing procedures, and thus, the GAS MLST scheme provides a validated method for the unambiguous identification of GAS isolates. Since it is based on nucleotide sequence data, MLST allows different laboratories to compare their results via the internet. A website containing an initial database of the allelic profiles and molecular properties of the 212 GAS isolates and associated epidemiological data, together with interrogation and analysis software, is available (http://www.mlst.net).

The organisms initially selected for analysis by MLST represented a total of 78 emm types, and their isolation from human subjects dates back nearly 60 years. A future goal is to apply the MLST scheme to at least one isolate of every known emm type, collected from worldwide sources. A thorough documentation of existing GAS clones will lay the groundwork for gaining a better understanding of the epidemiological trends underlying GAS disease and aid in deciphering the molecular basis for biological diversity within this species.

emm type provides the basis for a serological typing scheme that differentiates between antigenic epitopes contained within the amino-terminal, distal region of M-protein surface fibrils. Serum immunoglobulin G directed to M-type-specific epitopes leads to protective immunity for most strains that have been studied (1, 9, 25). Furthermore, the M proteins are key virulence factors, displaying a wide array of functional activities that act to promote disease (8). Unlike the housekeeping loci, emm genes are highly variable as a consequence of diversifying selection applied by the host immune response. It might therefore be expected that emm type would change more rapidly than alleles at housekeeping loci, resulting in variation within emm type among isolates of a clone or clonal complex. However, emm type is not defined by a unique nucleotide sequence but by ≥95% sequence identity. Consequently, descendants of an ancestral strain may accumulate as many as eight nucleotide changes (and small indels or frameshifts) within the 160-bp sequenced region of the emm gene without altering the emm type, whereas even a single nucleotide change in the ∼450-bp sequenced regions of any of the seven housekeeping loci results in a change in allelic profile. There are a few examples of isolates with identical allelic profiles having different emm types, such as ST65, which includes isolates of emm19, emm29, and emmst1RP31. Presumably, in these isolates, recombinational exchanges have resulted in the replacement of the region of the emm gene that defines emm type with the corresponding region from isolates of different emm types, since their divergence in emm type far exceeds 5%. Another multilocus typing method, MLEE, has also uncovered examples of isolates of the same genotype having different emm types (24, 30, 34).

A striking finding of this report is the degree to which multiple isolates of a given emm type share identical or highly similar allelic profiles (Table 3). Isolates of these emm types are considered to be clones or to form a clonal complex consisting of isolates with closely related allelic profiles. A much more extensive sampling of the GAS population will confirm the validity of this concept. The finding of a high one-to-one correspondence between emm type and clones or clonal complex suggests that GAS clones typically emerge and begin to diversify without changing their ancestral emm type. Recent studies using statistical tests of congruence between different housekeeping loci have indicated that recombination may be relatively common in GAS (16). This view was also supported by the lack of significant linkage disequilibrium between alleles that was observed when multiple isolates with similar genotypes were removed from the MLST data set, as measured by the Index of Association (27). Given this evidence for a major impact of recombination in the evolution of GAS populations, it is surprising that horizontal gene transfer appears to have rarely resulted in the presence of the same emm type in distantly related lineages. There are examples of this phenomenon, but they are uncommon. For example, among emm1 isolates (the most intensively sampled emm type), 22 of the 23 isolates form a cluster of lineages that all descend from the same relatively deep node (genetic distance of 0.5), whereas the other emm1 isolate differed from the former emm1 isolates at six or seven of the seven loci (Fig. 1; Table 2) (30).
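
As a rough illustration of the two quantities used in this paragraph, the sketch below counts the loci at which two allelic profiles differ and computes an Index of Association in the commonly used form I_A = V_O/V_E - 1, where V_O is the observed variance of pairwise locus differences and V_E is the variance expected under linkage equilibrium. Several choices here are assumptions rather than details taken from this report: the uncorrected gene diversity h_j = 1 - Σp², the use of the population variance, the function names, and the toy profiles, which are invented. No significance test is included.

```python
from collections import Counter
from itertools import combinations


def locus_differences(profile_a, profile_b):
    """Number of loci at which two allelic profiles carry different alleles."""
    return sum(a != b for a, b in zip(profile_a, profile_b))


def index_of_association(profiles):
    """Return I_A = V_O / V_E - 1 for a list of equal-length allelic profiles."""
    n = len(profiles)
    n_loci = len(profiles[0])

    # Observed variance of the number of differing loci over all isolate pairs.
    distances = [locus_differences(a, b) for a, b in combinations(profiles, 2)]
    mean_d = sum(distances) / len(distances)
    v_obs = sum((d - mean_d) ** 2 for d in distances) / len(distances)

    # Expected variance under linkage equilibrium: sum over loci of h_j * (1 - h_j),
    # with h_j taken as 1 minus the sum of squared allele frequencies at locus j.
    v_exp = 0.0
    for j in range(n_loci):
        counts = Counter(p[j] for p in profiles)
        h_j = 1 - sum((c / n) ** 2 for c in counts.values())
        v_exp += h_j * (1 - h_j)

    if v_exp == 0:
        raise ValueError("no polymorphic loci; I_A is undefined")
    return v_obs / v_exp - 1


# Invented seven-locus profiles: the second differs from the first at six loci.
typical = (1, 2, 3, 4, 5, 6, 7)
outlier = (1, 9, 9, 9, 9, 9, 9)
print(locus_differences(typical, outlier))

# Invented two-locus data sets: alleles fully linked versus freely combined.
linked = [(1, 1), (1, 1), (2, 2), (2, 2)]
mixed = [(1, 1), (1, 2), (2, 1), (2, 2)]
print(index_of_association(linked), index_of_association(mixed))
```

In this toy comparison the linked data set gives the larger value, which is the direction expected when alleles at different loci are associated. With so few isolates the absolute values are not meaningful.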

MLST studies of Streptococcus pneumoniae have also shown that isolates with identical or closely related allelic profiles almost invariably have the same serotype. However, in contrast to the findings on GAS, there are often multiple examples of distantly related clones or clonal complexes sharing the same pneumococcal serotype. The paucity of distantly related GAS lineages sharing the same emm type may reflect differences in the strength of the immune response against pneumococcal capsular polysaccharides compared to that against M proteins, leading to differences in the strength of competitive exclusion between clones with the same capsular serotype or emm type. However, it might also be explained by the likelihood that changes in GAS serotype (i.e., emm type) occur by both mutation and recombination, whereas recombination involving the capsular biosynthetic genes is the only known mechanism underlying serotype changes in pneumococci (7). In the presence of strong selective immunological pressures, the diversification of emm genes might be further promoted by highly mutable processes, such as frameshift mutation and DNA slipped-strand mispairing (21, 28, 31). Unless recombinational exchanges that result in the presence of the same emm type in different lineages have occurred relatively recently, the diversifying selection applied by the host immune system is likely to result in the divergence of the emm types of the parental and recipient lineages. Thus, descendants of ancient horizontal genetic transfer events that distributed a particular pneumococcal capsular locus into multiple lineages may have retained the same serotype, whereas it is far less likely that the descendants of a similar ancient horizontal distribution of an emm gene will have retained the original emm type. The different extent to which the same capsular or M type is found in different lineages of pneumococci or GAS may rest more on the ease with which serotypes can change in these species, rather than differences in the rates of horizontal gene transfer.

The GAS MLST scheme provides a new and unambiguous method for characterizing GAS isolates for epidemiological purposes by using the internet. The MLST data can be used to address several epidemiological issues concerning GAS disease. Changes in epidemiological trends can be more readily ascribed to the emergence of new clones. Vaccine design strategies can be further refined, and vaccine efficacy can be measured with greater precision. The sequences of fragments of seven housekeeping genes from hundreds of GAS isolates provide data that can be used to address aspects of the population and evolutionary biology of the species. For example, the ancestral relationships and patterns of descent among closely related isolates can be deduced, although relationships between more distantly related isolates are likely to be obscured by a history of recombination (16). The population genetic structure of GAS, based on neutral housekeeping loci, will provide a framework upon which to measure the distribution of adaptive loci. This, in turn, should provide new insights into the molecular basis for biological diversity among GAS, as well as the role of cell surface antigens in structuring the population (19, 20).

ACKNOWLEDGMENTS

We thank Yury Nunez, Eric Peterson, and Michelle Benitez for expert technical assistance, Susan Hollingshead (UAB) for supplying the MGAS strains, and Jim Hadler and Nancy Barrett (CT DOH) for providing the invasive isolates collected in Connecticut during 1998 (CT98 series) and the emm-typing data. We also acknowledge the Streptococcal Genome Sequencing Project funded by USPHS/NIH grant AI-38406 and the work performed by B. A. Roe, S. P. Linn, L. Song, X. Yuan, S. Clifton, R. E. McLaughlin, M. McShan, and J. Ferretti.

This work was supported by grants from the Wellcome Trust (to B.G.S.), the National Institutes of Health (AI-28944 to D.E.B. and GM-60793 to D.E.B. and B.G.S.), the American Heart Association (grant-in-aid to D.E.B.), and a Brown-Coxe Postdoctoral Fellowship (to A.K.). M.C.E. is a Royal Society University Research Fellow. D.E.B. is an Established Investigator of the American Heart Association.

Notes

Editor: E. I. Tuomanen

FOOTNOTES

    • Received 21 November 2000.
    • Returned for modification 4 January 2001.
    • Accepted 24 January 2001.
  • Copyright © 2001 American Society for Microbiology

REFERENCES

  1. Beachey E. H., Seyer J. M., Dale J. B., Simpson W. A., Kang A. H. Type-specific protective immunity evoked by synthetic peptide of Streptococcus pyogenes M protein. Nature 292:457-459 (1981).
  2. Beall B., Facklam R., Hoenes T., Schwartz B. Survey of emm sequences and T-antigen types from systemic Streptococcus pyogenes infection isolates collected in San Francisco, California; Atlanta, Georgia; and Connecticut in 1994 and 1995. J. Clin. Microbiol. 35:1231-1235 (1997).
  3. Beall B., Facklam R., Thompson T. Sequencing emm-specific PCR products for routine and accurate typing of group A streptococci. J. Clin. Microbiol. 34:953-958 (1996).
  4. Beall B., Gherardi G., Lovgren M., Forwick B., Facklam R., Tyrrell G. Emm and sof gene sequence variation in relation to serological typing of opacity factor positive group A streptococci. Microbiology 146:1195-1209 (2000).
  5. Bessen D. E., Carapetis J. R., Beall B., Katz R., Hibble M., Currie B. J., Collingridge T., Izzo M. W., Scaramuzzino D. A., Sriprakash K. S. Contrasting molecular epidemiology of group A streptococci causing tropical and non-tropical infections of the skin and throat. J. Infect. Dis. 182:1109-1116 (2000).
  6. Bessen D. E., Izzo M. W., Fiorentino T. R., Caringal R. M., Hollingshead S. K., Beall B. Genetic linkage of exotoxin alleles and emm gene markers for tissue tropism in group A streptococci. J. Infect. Dis. 179:627-636 (1999).
  7. Coffey T. J., Enright M. C., Daniels M., Morona J. K., Morona R., Hryniewicz W., Paton J. C., Spratt B. G. Recombinational exchanges at the capsular polysaccharide biosynthetic locus lead to frequent serotype changes among natural isolates of Streptococcus pneumoniae. Mol. Microbiol. 27:73-83 (1998).
  8. Cunningham M. W. Pathogenesis of group A streptococcal infections. Clin. Microbiol. Rev. 13:470-511 (2000).
  9. Dale J., Simmons M., Chiang E., Chiang E. Recombinant, octavalent group A streptococcal M protein vaccine. Vaccine 14:944-948 (1996).
  10. Desai M., Tanna A., Efstratiou A., George R., Clewley J., Stanley J. Extensive genetic diversity among clinical isolates of Streptococcus pyogenes serotype M5. Microbiology 144:629-637 (1998).
  11. Dowson C., Coffey T., Spratt B. Penicillin-binding protein mediated resistance to beta-lactam antibiotics in naturally-transformable pathogens. Trends Microbiol. 2:361-366 (1994).
  12. Enright M., Day N., Davies C., Peacock S., Spratt B. Multilocus sequence typing for characterization of methicillin-resistant and methicillin-susceptible clones of Staphylococcus aureus. J. Clin. Microbiol. 38:1008-1015 (2000).
  13. Enright M., Spratt B. Multilocus sequence typing. Trends Microbiol. 7:482-487 (1999).
  14. Enright M. C., Spratt B. G. A multilocus sequence typing scheme for Streptococcus pneumoniae: identification of clones associated with invasive disease. Microbiology 144:3049-3060 (1998).
  15. Facklam R., Beall B., Efstratiou A., Fischetti V., Kaplan E., Kriz P., Lovgren M., Martin D., Schwartz B., Totolian A., Bessen D., Hollingshead S., Rubin F., Scott J., Tyrrell G. Report on an international workshop: demonstration of emm typing and validation of provisional M-types of group A streptococci. Emerg. Infect. Dis. 5:247-253 (1999).
  16. Feil E. J., Holmes E. C., Bessen D. E., Chan M.-S., Day N. P. J., Enright M. C., Goldstein R., Hood D., Kalia A., Moore C. E., Zhou J., Spratt B. G. Recombination within natural populations of pathogenic bacteria: short-term empirical estimates and long-term phylogenetic consequences. Proc. Natl. Acad. Sci. USA 98:182-187 (2001).
  17. Fiorentino T. R., Beall B., Mshar P., Bessen D. E. A genetic-based evaluation of principal tissue reservoir for group A streptococci isolated from normally sterile sites. J. Infect. Dis. 176:177-182 (1997).
  18. Gardiner D., Hartas J., Currie B., Mathews J. D., Kemp D. J., Sriprakash K. S. Vir typing: a long-PCR typing methods for group A streptococci. PCR Methods App. 4:288-293 (1995).
  19. Gupta S., Anderson R. Population structure of pathogens: the role of immune selection. Parasitol. Today 15:497-501 (1999).
  20. Gupta S., Maiden M. C. J., Feavers I. M., Nee S., May R. M., Anderson R. M. The maintenance of strain structure in populations of recombining infectious agents. Nat. Med. 2:437-442 (1996).
  21. Harbaugh M. P., Podbielski A., Hugl S., Cleary P. P. Nucleotide substitutions and small-scale insertion produce size and antigenic variation in group A streptococcal M1 protein. Mol. Microbiol. 8:981-991 (1993).
  22. Kapur V., Topouzis S., Majesky M. W., Li L.-L., Hamrick M. R., Hamill R. J., Patti J. M., Musser J. M. A conserved Streptococcus pyogenes extracellular cysteine protease cleaves human fibronectin and degrades vitronectin. Microb. Pathog. 15:327-346 (1993).
  23. Kehoe M. A. Cell wall-associated proteins in Gram-positive bacteria. New. Comphr. Biochem. 27:217-261 (1995).
  24. Kehoe M. A., Kapur V., Whatmore A. M., Musser J. M. Horizontal gene transfer among group A streptococci: implications for pathogenesis and epidemiology. Trends Microbiol. 4:436-443 (1996).
  25. Lancefield R. C. Current knowledge of the type specific M antigens of group A streptococci. J. Immunol. 89:307-313 (1962).
  26. Maiden M., Bygraves J., Feil E., Morelli G., Russell J., Urwin R., Zhang Q., Zhou J., Zurth K., Caugant D., Feavers I., Achtman M., Spratt B. Multilocus sequence typing: a portable approach to the identification of clones within populations of pathogenic microorganisms. Proc. Natl. Acad. Sci. USA 95:3140-3145 (1998).
  27. Maynard Smith J., Smith N. H., O'Rourke M., Spratt B. G. How clonal are bacteria? Proc. Natl. Acad. Sci. USA 90:4384-4388 (1993).
  28. Moxon E. R., Rainey P. B., Nowak M. A., Lenski R. E. Adaptive evolution of highly mutable loci in pathogenic bacteria. Curr. Biol. 4:24-33 (1994).
  29. Musser J. M., Hauser A. R., Kim M. H., Schlievert P. M., Nelson K., Selander R. K. Streptococcus pyogenes causing toxic-shock-like syndrome and other invasive diseases: clonal diversity and pyrogenic exotoxin expression. Proc. Natl. Acad. Sci. USA 88:2668-2672 (1991).
  30. Musser J. M., Kapur V., Szeto J., Pan X., Swanson D. S., Martin D. M. Genetic diversity and relationships among Streptococcus pyogenes strains expressing serotype M1 protein: recent intercontinental spread of a subclone causing episodes of human disease. Infect. Immun. 63:994-1003 (1995).
  31. Relf W. A., Martin D. R., Sriprakash K. S. Antigenic diversity within a family of M proteins from group A streptococci: evidence for the role of frameshift and compensatory mutations. Gene 144:25-30 (1994).
  32. Selander R. K., Caugant D. A., Ochman H., Musser J. M., Gilmour M. N., Whittam T. S. Methods of multilocus electrophoresis for bacterial population genetics and systematics. Appl. Environ. Microbiol. 51:873-884 (1986).
  33. Upton M., Carter P., Orange G., Pennington T. Genetic heterogeneity of M type 3 group A streptococci causing severe infections in Tayside, Scotland. J. Clin. Microbiol. 34:196-198 (1996).
  34. Whatmore A. M., Kapur V., Sullivan D. J., Musser J. M., Kehoe M. A. Non-congruent relationships between variation in emm gene sequences and the population genetic structure of group A streptococci. Mol. Microbiol. 14:619-631 (1994).
  35. Zurawski C. A., Bardsley M., Beall B., Elliot J. A., Facklam R., Schwartz B., Farley M. M. Invasive group A streptococcal disease in metropolitan Atlanta: a population-based assessment. Clin. Infect. Dis. 27:150-157 (1998).
Multilocus Sequence Typing of Streptococcus pyogenes and the Relationships between emm Type and Clone
Mark C. Enright, Brian G. Spratt, Awdhesh Kalia, John H. Cross, Debra E. Bessen
Infection and Immunity Apr 2001, 69 (4) 2416-2427; DOI: 10.1128/IAI.69.4.2416-2427.2001


KEYWORDS

Antigens, Bacterial
Bacterial Outer Membrane Proteins
Bacterial Proteins
Bacterial Typing Techniques
Carrier Proteins
Streptococcus pyogenes

Print ISSN: 0019-9567; Online ISSN: 1098-5522