Infection and Immunity, August 2003, p. 4563-4579, Vol. 71, No. 8
0019-9567/03/$08.00+0 DOI: 10.1128/IAI.71.8.4563-4579.2003
Copyright © 2003, American Society for Microbiology. All Rights Reserved.
Israel Institute for Biological Research, Ness Ziona 74100, Israel (1); National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894 (2); United States Army Medical Research Institute of Infectious Diseases, Fort Detrick, Maryland 21702 (3)
Received 14 January 2003/ Returned for modification 19 March 2003/ Accepted 1 May 2003
Fully virulent forms of B. anthracis carry two large plasmids: pXO1 and pXO2. The plasmids are considered major virulence determinants, as strains lacking either one are attenuated in animal hosts (58, 67, 68). Most of the documented B. anthracis virulence factors identified so far are encoded by plasmid-derived genes, including the two major virulence factors: the tripartite toxin and the antiphagocytic capsule. The tripartite toxin is encoded by the genes pagA, lef, and cya (located on pXO1), which code for protective antigen (PA), lethal factor, and edema factor, respectively, whereas the genes encoding the antiphagocytic capsule are located on pXO2. A battery of as-yet-undefined virulence factors probably resides on the B. anthracis chromosome (8, 9, 12, 58), the sequence of which has been recently completed (78).
The licensed human vaccine consists of the PA component of the anthrax toxin as the principal protective immunogen. However, for effective long-term protection, multiple immunizations are required. Studies in experimental animal models indicated that the efficacy of PA vaccines is far below that of the Sterne live spore vaccine (105). Therefore, it has been suggested that additional somatic antigens and/or cellular immunity may be required for full protection. Recent studies report that an immune response directed against spore antigens, elicited either through live vaccines (17) or by supplementing a PA-based vaccine with formalin-inactivated spores (11), is indeed involved in enhanced protection. Identification of spore antigens or additional vegetative antigens as possible enhancers of vaccine efficacy could permit development of improved vaccines for human use (8, 9, 17, 58).
Until recently, the major barrier to target-based screening of potential vaccine candidates has been the limited number of cloned and characterized bacterial genes. The currently available genomic sequences of numerous human pathogens facilitate the identification, analysis, and cloning of genes of interest. Genome-based selection of vaccine candidates has recently been termed "reverse vaccinology" (73-75). In silico gene selection, in combination with functional genomics studies, has been applied towards novel vaccine generation for several human pathogens (e.g., Neisseria meningitidis [69], Streptococcus pneumoniae [109], and Chlamydia pneumoniae [59]) and recently also to the selection of potential vaccine candidates from the B. anthracis virulence plasmid pXO1 (7).
Here we describe the results of a genome-based bioinformatic screening of the entire B. anthracis draft chromosome (sequenced by The Institute for Genomic Research [TIGR], Rockville, Md.; Feb. 2001 version, 460 contigs, 98% coverage [9]), following its translation and function assignments, for putative vaccine candidates. This screening process comprised a search for gene products with recognizable sequence or structural features characteristic of proteins that either confer protective immunity (consisting mostly of surface-exposed or exported proteins) or are similar to documented microbial virulence factors. The in silico approach allowed for identification of 520 B. anthracis open reading frame (ORF) products (240 with putative function and 280 hypothetical proteins or proteins of unknown function), mostly exhibiting features of surface-exposed or exported proteins. Proteomic analysis of a B. anthracis membrane fraction allowed for validation of the selection process, verifying the expression of several membrane-associated candidate ORF products, and for assessment of their immunoreactivity (by immunoblotting with B. anthracis immune animal sera). The complementary strategies employed could provide the basis for subsequent experimental evaluation of a future generation of anthrax vaccines.
Translation and assignment of ORFs were accomplished using the Wimklein module (SEALS package; NCBI [103]), a naive translator based on intrinsic properties of the sequence. A minimal length of 75 amino acids (aa) per ORF was imposed. The translation of the February version yielded 5,045 ORFs, ranging in length from 75 to 5,017 aa.
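The effect of the 75-aa length cutoff can be illustrated with a minimal sketch of a naive six-frame translator. This is not the Wimklein module, whose internals are not described here; the function names, the stop-to-stop definition of an ORF, and the use of the standard genetic code are all assumptions made for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumed table).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate in frame 0; unknown codons become 'X', stops become '*'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def naive_orfs(dna, min_len=75):
    """Hypothetical helper: collect stop-to-stop translations of at least
    min_len residues from all six reading frames."""
    dna = dna.upper()
    orfs = []
    for strand in (dna, reverse_complement(dna)):
        for frame in range(3):
            for segment in translate(strand[frame:]).split("*"):
                if len(segment) >= min_len:
                    orfs.append(segment)
    return orfs
```

A translator of this kind reports every sufficiently long stop-free stretch, which is why a length cutoff and later filtering steps are needed to limit spurious ORFs.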
Sequence similarity searches were conducted by running Blast analyses (3, 4), using the splishpgp module (SEALS package; NCBI), against the following databases: nonredundant (NCBI), unfinished microbial genomes (NCBI), Clostridium acetobutylicum genome (prior to its publication [65]), Bacillus halodurans genome (upon its publication [96]), and Bacillus cereus ATCC 14579 draft genome (Integrated Genomics, Ill.). The filtering program SEG (110) was applied to mask sequence segments exhibiting low compositional complexity. However, for ORFs longer than 200 aa, a nonfiltered analysis was carried out as well. Blast results were mapped according to taxonomy using the module tax_collector (SEALS, NCBI). Paralogs were identified by Blast analysis of each ORF against the B. anthracis genome database. ORFs were defined as paralogs, in cases where sequence similarity extends over 80% mutual coverage and the expectation value of the alignment is smaller than e-10. The Blast results were tabulated by the Btab program (NCBI) and further parsed by in-house Perl scripts.
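The paralog criterion stated above can be sketched as a simple predicate. The `Hit` record and its field names are hypothetical, "mutual coverage" is read here as the aligned fraction of both sequences, and "e-10" is read as 10^-10; these readings are assumptions rather than details confirmed by the text.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical summary of one pairwise Blast alignment."""
    query_len: int        # length of the query ORF product (aa)
    subject_len: int      # length of the subject ORF product (aa)
    query_aligned: int    # aligned residues on the query
    subject_aligned: int  # aligned residues on the subject
    evalue: float         # expectation value of the alignment


def is_paralog(hit, min_coverage=0.80, max_evalue=1e-10):
    """True when both sequences are covered over more than min_coverage
    of their length and the E-value is below max_evalue."""
    query_cov = hit.query_aligned / hit.query_len
    subject_cov = hit.subject_aligned / hit.subject_len
    return min(query_cov, subject_cov) > min_coverage and hit.evalue < max_evalue
```

Requiring coverage on both sequences keeps a short protein that matches only one domain of a much longer protein from being counted as a paralog.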
Analysis of protein domains was carried out by searching against the Pfam (89), SMART (85), and CDD databases (107), using the HMM program installed on the NCBI SGI server. Orthology was assigned through analysis of the 5,045 ORFs (February version) against the COGs database (100) (carried out by R. Tatusov [NCBI]).
Cellular localization predictions for each ORF were carried out as follows: prediction of the presence and location of signal peptides in the N-terminal 70 aa of an ORF, using the program SignalP (63) (run locally on the NCBI server); prediction of membrane-spanning regions, using the program Tmpred (29), run via the BCM Launcher batch client (87); recognition of lipoprotein signatures by the Lipop program of the PSORT package (60) (installed locally on the IIBR server); identification of the presence of a typical sequence signature (32, 92), as well as identification of gram-positive specific anchoring motifs (13, 62), using the GREF module (SEALS, NCBI) and in-house scripts. Anchoring motifs probed include the tripartite sortase motif (32), a refined motif with increased sensitivity and specificity that includes the traditional LPXTG motif; the iron-dependent sortase motif (48); choline-binding motifs (identified using the Pfam HMM PF01473); and PKD (Pfam HMM PF00801), LysM (Pfam HMM PF01476), and SLH domains (Pfam HMM PF00395) (13). The draft chromosome-derived ORFs generated in this study were compared to ORFs derived from the full sequence of the B. anthracis Ames strain chromosome, obtained from TIGR (78), and to the recently deposited draft sequence of the Bacillus anthracis A2012 strain (79) (GenBank accession number NC_003995, gi|21397375), by Blast analysis. Table 1 incorporates the equivalent TIGR ORF number and annotation, as well as the gi number of the A2012 strain ORF equivalent, for each of our selected draft chromosome ORFs. Throughout the text, ORF products are referred to by the draft chromosome ORF number, as listed in Table 1.
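As a simplified illustration of motif-based screening, the traditional LPXTG sortase motif can be located with a regular expression. This sketch does not reproduce the refined tripartite motif (32) or the GREF module; the function name and the plain `LP.TG` pattern are assumptions.

```python
import re

# Traditional sortase recognition motif: Leu-Pro-any residue-Thr-Gly.
LPXTG = re.compile(r"LP.TG")


def find_lpxtg(protein):
    """Hypothetical helper: return (1-based position, matched text) for
    every non-overlapping LPXTG occurrence in a protein sequence."""
    return [(m.start() + 1, m.group()) for m in LPXTG.finditer(protein.upper())]
```

A bare five-residue pattern matches by chance in some non-anchored proteins, which is consistent with the text's use of a refined motif with increased specificity.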
TABLE 1. List of selected B. anthracis potential vaccine candidates and virulence factorsa
14185, pXO1-, pXO2- strain [17]) were cultured to late stationary stage in Luria broth medium for 48 h at 37°C with vigorous agitation. Preliminary sodium dodecyl sulfate (SDS)-polyacrylamide gel electrophoresis and Western blot analysis (data not shown) of cell membranes taken from various culture growth phases (6, 12, 24, and 48 h) established that the 48-h cells contain the largest number of polypeptides that cross-react with B. anthracis immune sera (see below). This preparation was therefore selected for the subsequent proteomic analysis of membrane proteins. The protein signature of the B. anthracis strain under study (pXO1-, pXO2-) reflects the expression of chromosomal ORFs only. However, one should keep in mind that expression of some chromosomal genes may be affected by pXO1- or pXO2-encoded regulators, as demonstrated recently for the sap gene (54). Cells were collected by centrifugation and washed twice with PBS and once in 50 mM Tris base (pH 8), followed by resuspension in 2 ml of 50 mM Tris base buffer (pH 8) at a concentration of 10^10 cells/ml and sonication for 30 s. The insoluble material was collected by Eppendorf centrifugation (15 min), washed twice with Tris base buffer (pH 8), and incubated in a tube rotator for 30 min at room temperature in an extraction solution consisting of 8 M urea, 4% (wt/vol) 3-[(3-cholamidopropyl)-dimethylammonio]-1-propanesulfonate (CHAPS), 40 mM Tris, and 0.2% (wt/vol) Bio-Lyte 3/10 (Bio-Rad). Following centrifugation, the supernatant, representing a fraction enriched in membrane proteins, was stored at -70°C until use. The protein concentration of this fraction was 1.5 mg/ml.
Two-dimensional gel electrophoresis (2-DE) and serological analysis.
Four hundred micrograms of the membrane protein mixture was separated first by isoelectric focusing (IEF) on ready-made 17-cm, pH 3 to 10 nonlinear immobilized pH gradient (IPG) strips (Immobiline DryStrips, Amersham Pharmacia Biotech) applied to a Protean IEF cell (Bio-Rad). IEF was carried out at 10,000 V to a total of 50,000 V·h, initiated by a slow step at 250 V for 30 min. Strips were then processed for the second-dimension separation by a 10-min incubation in a solution containing 6 M urea, 2% SDS, 0.375 M Tris-HCl (pH 8.8), 20% glycerol, and 2% (wt/vol) dithiothreitol, followed by a 10-min incubation in a similar solution in which the dithiothreitol was replaced by 2% iodoacetamide. Strips were applied to 12.5% polyacrylamide SDS gels, and electrophoresis was carried out on an Ettan DALT II System (Amersham Pharmacia Biotech). Gels were stained with Coomassie blue, and spots were detected and analyzed by scanning on a Bio-Rad GS-800 Calibrated Densitometer assisted by the PDQuest 2-D Gel Analysis Software (Bio-Rad). Western blots were probed with immune guinea pig serum taken 9 weeks after the first immunization from animals inoculated with four doses (two weeks apart) of 5 × 10^7 B. anthracis MASC-10 spores (17). The ELISA-determined anti-B. anthracis (total cellular extract) titer of the immune serum was 1:12,800.
In-gel trypsin digestion and matrix-assisted laser desorption ionization-time-of-flight mass spectrometry (MALDI-TOF MS) protein identification.
Protein spots were cut from 2-DE gels and destained for 1 h in 30% acetonitrile (CH3CN)-50 mM ammonium bicarbonate and subjected to in-gel overnight digestion with 10 µl of trypsin (6.25 µg/ml; Promega). Gel fragments were washed twice with double-distilled water followed by peptide elution with 1% trifluoroacetic acid (TFA) (20 min, room temperature) and 50% CH3CN (20 min, room temperature). Finally, eluted tryptic peptides were dried under vacuum and resuspended in 10 µl of 25% CH3CN-0.1% TFA. Two microliters was mixed with an equal volume of 10-mg/ml α-cyano solution containing 50% ethanol, 25% CH3CN, and 0.1% TFA, applied to a MALDI-TOF MS target, and allowed to crystallize in vacuum. Mass spectra were obtained on a Micromass TofSpec 2E in positive ion reflectron mode, using a source voltage of 20,000 V, a pulse voltage of 2,600 to 3,000 V, and a laser intensity of 20%. External calibration using standard peptides was applied. Spectra were compared to theoretical tryptic digestion fragments of all ORFs derived from the B. anthracis (pXO1-, pXO2-) Ames strain (78) genomic DNA sequence. Identification of proteins was based on peptide coverage of more than 30% and peptide mass deviation between observed and calculated values of less than 100 ppm.
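The two acceptance criteria above (mass deviation below 100 ppm and coverage above 30%) can be expressed as a short sketch. The function names and the tuple layout are hypothetical, and "peptide coverage" is interpreted here as the fraction of the protein sequence covered by matched peptides, which is an assumption.

```python
def ppm_error(observed, calculated):
    """Absolute mass deviation in parts per million."""
    return abs(observed - calculated) / calculated * 1e6


def sequence_coverage(protein_len, peptide_spans):
    """Fraction of residues covered by peptides given as 1-based,
    inclusive (start, end) spans; overlaps are counted once."""
    covered = set()
    for start, end in peptide_spans:
        covered.update(range(start, end + 1))
    return len(covered) / protein_len


def accept_identification(protein_len, matches, min_coverage=0.30, max_ppm=100.0):
    """matches: iterable of (start, end, observed_mass, calculated_mass).
    Peptides outside the ppm tolerance are ignored before coverage is computed."""
    spans = [
        (start, end)
        for start, end, observed, calculated in matches
        if ppm_error(observed, calculated) < max_ppm
    ]
    return sequence_coverage(protein_len, spans) > min_coverage
```

For instance, a peptide observed at 1000.05 Da against a calculated 1000.00 Da deviates by about 50 ppm and would count towards coverage under these thresholds.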
Sequence similarity searches and taxonomical classification.
Blast analysis against the nonredundant database (NCBI) resulted in a preliminary categorization of the 5,045 ORF products into three groups. The first group comprised ~3,000 ORF products (Fig. 1, right lane), for which a putative function could be assigned based on significant similarity (expectation values smaller than e-3) to a known protein in the databank. The second group comprised ~800 ORF products assigned as hypothetical, uncharacterized, or putative proteins (Fig. 1, middle lane). The third group included ~1,200 ORF products for which no function could be assigned (including ORF products similar to a protein with an unknown function as well as B. anthracis unique ORFs; Fig. 1, left lane).
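The three-way categorization can be sketched as a small classifier over the best Blast hit. The function name, the keyword lists, and the reading of "e-3" as 10^-3 are assumptions for illustration; the actual assignment in the study also involved inspection that keyword matching cannot capture.

```python
def classify_orf(top_hit_description, evalue, max_evalue=1e-3):
    """Hypothetical classifier returning 'known', 'hypothetical', or 'unknown'.

    top_hit_description: annotation of the best database hit, or None.
    evalue: expectation value of that hit, or None when there is no hit.
    """
    # No hit at all, or no significant hit: no function can be assigned.
    if top_hit_description is None or evalue is None or evalue >= max_evalue:
        return "unknown"
    text = top_hit_description.lower()
    # Significant similarity to a protein whose function is itself unknown.
    if "unknown function" in text:
        return "unknown"
    # Assumed keywords marking uncharacterized database entries.
    keywords = ("hypothetical protein", "uncharacterized", "putative protein")
    if any(word in text for word in keywords):
        return "hypothetical"
    return "known"
```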
FIG. 1. Strategy for reductive selection of vaccine candidates from the B. anthracis chromosome draft sequence (version of February 2001). Flowchart of the computational analysis and the filtering steps, of the B. anthracis chromosome candidate selection (left panel). Filtering steps are noted by letters (a through d, as detailed in the right panel), and the resulting number of ORF products at each step is noted in a shaded box.
Selection of potential vaccine candidates and virulence factors, from the ORF product set with putatively assigned function.
Taxonomy-based subtraction of B. anthracis ORFs exhibiting sequence similarity to proteins from nonpathogenic bacteria from the ~3,000 "known" ORFs (first group) resulted in ~2,000 ORF products with sequence similarity to known proteins from pathogenic or eukaryotic organisms. We initially scanned the list of ~2,000 ORFs (Fig. 1) for putative housekeeping genes; all genes representing ribosomal proteins, phage proteins, and fragmented genes were subsequently removed. Because the presence of more than one copy of a selected gene may impose restrictions on future experiments (complementation), ORF products with more than two paralogs in the genome were excluded as well. Similarly, in order to avoid possible cloning problems, putative proteins with more than four predicted transmembrane segments were also removed. ORF products predicted by protein localization algorithms to be surface-associated or secreted components were selected, as were proteins with sequence similarity to those described as surface-exposed proteins in other bacteria, independent of the in silico prediction. ORF products resembling virulence-associated proteins, irrespective of their cellular location, were retained. The resulting ~450 ORF products were subjected to comparative genomics analysis, removing ORF products with significant similarity (not necessarily as the first Blast hit) to proteins from the nonpathogenic bacilli B. subtilis and B. halodurans (upon completion and publication of the genome sequence). The remaining ORF products were inspected individually by manual curation, leaving 240 putative ORF products with potential vaccine and/or virulence relevance, as listed in Table 1. For a schematic representation of the reductive strategy, see Fig. 1.
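The automated part of this reductive selection can be summarized as a filter cascade. The record type, its field names, and the order in which the filters are applied are assumptions; the sketch leaves out the taxonomy-based subtraction and the manual curation step.

```python
from dataclasses import dataclass


@dataclass
class OrfProduct:
    """Hypothetical per-ORF summary of the in silico predictions."""
    name: str
    paralogs: int              # number of paralogs in the genome
    tm_segments: int           # predicted transmembrane segments
    surface_or_secreted: bool  # predicted or homology-inferred surface exposure
    virulence_like: bool       # resembles a virulence-associated protein
    housekeeping: bool = False # ribosomal, phage, or fragmented gene


def passes_filters(orf, max_paralogs=2, max_tm=4):
    """Apply the exclusion criteria, then the two inclusion criteria."""
    if orf.housekeeping:
        return False
    if orf.paralogs > max_paralogs:
        return False
    if orf.tm_segments > max_tm:
        return False
    return orf.surface_or_secreted or orf.virulence_like


def select_candidates(orfs):
    """Return the names of ORF products that survive all filters."""
    return [orf.name for orf in orfs if passes_filters(orf)]
```

Whether virulence-like ORF products were also exempt from the paralog and transmembrane limits is not stated in the text; this sketch applies those limits to all ORF products.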
The B. anthracis chromosome-derived list of putative vaccine candidates or virulence factors (Table 1) includes, as expected, multiple virulence-related subfamilies, namely, toxins, S-layer homology domain proteins, repeat proteins, adhesins/colonization factors, lytic enzymes, zinc proteases, etc. All of these subfamilies have been implicated in microbial pathogenesis in different organisms.
S-layer homology domain proteins. The B. anthracis cell surface, in the vegetative nonencapsulated state as well as the capsulated state, is covered by a cell wall polymer known as the surface layer (or S-layer [23, 50, 52, 84]). Various functions have been assigned to the S-layer, ranging from shape maintenance to virulence, host recognition evasion, cell adhesion, and resistance to phagocytosis (52, 58, 84). B. anthracis is known to synthesize two S-layer proteins, EA1 (extractable antigen 1) and Sap (surface array protein), which account for 5 to 10% of total cellular proteins (52). Both proteins contain a standard signal peptide followed by three SLH (S-layer homology) anchoring motifs (50). Both proteins are considered major surface antigens and vaccine carriers in vivo (50, 52, 53). Mock and Fouet (58) were the first to report that the B. anthracis genome harbors additional genes coding for SLH proteins (other than the two SLH proteins on pXO1 previously identified by Okinaka [66], which may constitute potential vaccine candidates [7]): an amidase on pXO2 (51) and several unidentified genes located on the bacterial chromosome (58). Inspection of the B. anthracis draft version of the chromosome for ORFs containing at least one SLH domain revealed the presence of 20 putative S-layer homology domain proteins (including Sap and EA1, products of ORFs 2 and 3). These include cell surface-targeted enzymes such as N-acetylmuramoyl-L-alanine amidases (e.g., products of ORFs 4 and 10) and proteins of unknown function (Table 1, 1st category).
Adhesins. Adhesins, bacterial surface proteins that interact with receptors on the eukaryotic cell, have been studied as targets for vaccine development for many years, since blocking the primary stages of infection could be an effective strategy to prevent bacterial infections (19, 37, 108). The whole-genome sequence of a pathogen could allow for identification of putative novel adhesins based on sequence and/or structural properties shared by bacterial and intracellular human adhesins. Examples of known adhesin families include fibronectin- or fibrinogen-binding proteins, collagen adhesins, etc. The B. anthracis draft chromosome was found to contain several putative fibronectin-binding proteins, ranging in size from 213 to 1,102 aa (Table 1). The product of ORF 38 is the only member of the group identified as belonging to COG 1293 (COGs database [99]) fibronectin-binding proteins and is similar to a fibronectin-binding protein from Streptococcus pyogenes, reported to confer protective immunity in mice (35). It also exhibits sequence similarity to putative fibronectin-binding proteins from B. subtilis and B. halodurans as well as to the S. pneumoniae adherence and virulence protein A (42% identity). Streptococcal fibronectin-binding proteins (in particular PavA from S. pneumoniae) were shown to be essential for extracellular targeting and efficient cellular invasion (2, 30, 47, 49, 97, 102, 104).
The B. anthracis genome harbors one ORF product (ORF 43), which represents another type of putative adhesin: collagen-adhesin, a peptidoglycan anchored protein. Vaccination with a recombinant fragment of the S. aureus collagen adhesin, and passive transfer of collagen adhesin-specific antibodies was shown to protect mice against sepsis-induced death (64). Recently it has been reported that human-derived antibodies to the collagen adhesin, such as the ace-encoded protein from Enterococcus faecalis, expressed during infection in humans, block microbial adherence (61).
Lipoprotein adhesins and autotransporters. A different type of so-called adhesin is represented by lipoproteins implicated in modulation of the immune system (93). A prototypical adhesin belonging to this group is the product of ORF 36, a 311-aa protein with both secretory and lipoprotein signals and a typical metal-binding site. This ORF product was identified as belonging to COG0830, a zinc-binding lipoprotein of the ABC type (surface adhesin A) involved in zinc uptake. It exhibits extensive sequence similarity to the streptococcal adhesin PsaA (pneumococcal surface antigen A) and to colonization factors of other gram-positive organisms. PsaA was shown to play an essential role in virulence (10, 98). Induction of an antibody response by immunization with purified PsaA protein, or with PsaA delivered as a DNA vaccine, correlates with protection of mice against otherwise fatal infection with S. pneumoniae (56). Although PsaA was originally considered to be an adhesin, it was subsequently shown that the psa operon encodes a manganese permease complex (21). Recently, Marra et al. (46) identified the psa promoter, which drives the expression of the psaBCA operon, as one of the promoters expressed during lung infection of mice with S. pneumoniae. In vivo analysis of the psa genes demonstrated the importance of this manganese transporter to S. pneumoniae virulence. A similar operon appears to be present in the B. anthracis chromosome. Based on its properties, the product of ORF 36 may be considered a B. anthracis vaccine candidate.
Repeat-containing proteins. Many surface proteins of gram-positive bacteria contain tandem repeat domains that can vary in size from several amino acids to several hundred amino acids. The importance of repeats in understanding protein function resides not only in their ability to confer multiple binding and structural roles on proteins (6) but also in their possible role in antigenic variation, phase variation (documented mostly for gram-negative organisms and resulting in differential gene expression), and subsequent immune escape (6, 16, 20, 31, 38, 45). The B. anthracis genome harbors several proteins (Table 1) with tandem repeats such as TPR-like (e.g., the product of ORF 59), ankyrin (e.g., the product of ORF 49), collagen-like (e.g., the product of ORF 73), LRR (leucine-rich repeats; e.g., the product of ORF 48), and diverse internal repeats (Table 1). Although certain repeat proteins may have a structural role, others could be involved in B. anthracis pathogenesis. Four B. anthracis ORF products harbor collagen-like repeats (multiple copies of G-X-X units that form a right-handed triple helix). Of these, the product of ORF 73 was also recently identified as a collagen-like surface glycoprotein and a structural component of the B. anthracis exosporium (95). This ORF product, like that of ORF 66, exhibits general sequence similarity to the streptococcal collagen-like protein SclB (76, 106) but also includes unique regions. Streptococcal collagen-like proteins were shown to participate in adherence to host cells and in soft tissue pathology (43, 44, 76, 77). Collagen-like proteins may also be an example of molecular mimicry of host proteins, exploited by many pathogens as a means of manipulating the host response (90). In eukaryotes, collagen-like proteins act as membrane-bound defense proteins and are part of the macrophage scavenger receptor and of soluble proteins such as C1q and collectins (101).
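Collagen-like regions of the kind described above, consecutive G-X-X units, lend themselves to a simple pattern search. The sketch below is illustrative only; the function name and the minimum of five consecutive triplets are hypothetical choices, not criteria taken from the study.

```python
import re


def collagen_like_runs(protein, min_repeats=5):
    """Hypothetical helper: return (1-based start, number of triplets) for
    each run of at least min_repeats consecutive G-X-X units."""
    pattern = re.compile(r"(?:G..){%d,}" % min_repeats)
    return [
        (m.start() + 1, len(m.group()) // 3)
        for m in pattern.finditer(protein.upper())
    ]
```

Raising `min_repeats` trades sensitivity for specificity, since short G-X-X runs occur by chance in glycine-rich proteins.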
Enzymes. The vaccine candidate list includes several enzymes (lytic enzymes, racemases, etc.; Table 1). One example is immune inhibitor A (InhA), a zinc metalloprotease and putative virulence factor in B. thuringiensis known to inhibit the immune system of insects by specifically cleaving antibacterial proteins produced by the insect host (41, 42). The draft sequence of the B. anthracis chromosome contains two paralogs (products of ORFs 87 and 105, Table 1), both secreted proteins, annotated as inhA. Blast analysis also shows significant sequence similarity to immune inhibitor-like proteins from Bacillus stearothermophilus, Vibrio cholerae O1, C. acetobutylicum, and Streptomyces coelicolor (expect values ranging from 0 to e-79). The putative functional significance of this enzyme in B. anthracis-related bacilli was first described by Charlton et al. (15), who reported its presence in the exosporium of B. cereus strain ATCC 10876. The inhA gene was also shown to exist in a majority of the 23 B. cereus strains and the 3 B. anthracis strains examined (26). Regulation of inhA expression depends on the transition state regulator AbrB, recently shown to be responsible for the timing of toxin expression in B. anthracis (83). The presence of InhA on the spore surface in bacilli of the B. cereus group suggests that it could play a role in bacterial survival in the host, as reported for other zinc proteases of human microbial pathogens (57). In a recent study, one of the two Bacillus thuringiensis inh genes was shown to be required for pathogenicity via the oral route.
Another class of enzymes is the proline racemases. Proline racemases were implicated in the virulence of organisms from several genera: Legionella (Mip macrophage infectivity potentiator [81]), Salmonella (SurA mutants are effective attenuated live oral vaccines [94]), and Trypanosoma (80). The B. anthracis draft sequence contains three ORF products with sequence similarity to a eukaryotic (Trypanosoma cruzi) B-cell mitogen and to microbial (Clostridium difficile, Clostridium sticklandii, and Pseudomonas aeruginosa) proline racemases. The product of ORF 93 is the closest sequence neighbor of the trypanosomal enzyme, maintaining the residues necessary for both the catalytic and the B-cell mitogenic activity of the Trypanosoma enzyme. In Trypanosoma, this enzyme is considered a target for the development of vaccines and anti-Chagas' disease drugs (36). Mutants show attenuated virulence and reduced adhesion, and antibodies raised against the Trypanosoma enzyme reduce infectivity. The same family of enzymes has been reported to have chaperone activity that preferentially facilitates maturation of outer membrane proteins; thus, proline racemases may play a dual role (71). Although these enzymes do not harbor any signal or anchoring motifs, they have been reported to be present on the surface and hence immunoaccessible (80). Their precise contribution to B. anthracis pathogenesis (whether direct or via a putative chaperone-like activity) needs to be evaluated experimentally.
Yet another example of enzymes with documented evidence of immunogenicity is the autolysins. Autolysins are members of a widely distributed group of enzymes that naturally digest the cell wall peptidoglycan backbone of bacterial organisms, resulting in cell lysis, death, and release of inflammatory cell wall components and cytoplasmic bacterial proteins. These enzymes are located in the cell envelope and are also presumed to play a role in a variety of cellular functions (40). According to their hydrolytic bond specificity, they are classified as muramidases, glucosaminidases, N-acetylmuramoyl-L-alanine amidases, amidases, and endopeptidases (88). The B. anthracis genome contains several ORFs with sequence similarity to autolysins and amidases (Table 1). The product of ORF 11 is the only amidase/autolysin that resembles a documented virulence-associated autolysin (the S. pneumoniae autolysin LytB). This is a 459-aa putative protein with three S-layer homology (SLH) anchoring domains in its N terminus (aa 1 to 201), followed by a LytB-like C-terminal domain. Most B. anthracis amidases seem to be anchored via SLH domains (either at their N terminus or C terminus [Table 1]), including a recently identified autolysin in pXO2; others are probably anchored via peptidoglycan recognition motifs (e.g., ORF 169) or SH3b domains (e.g., ORF 170); and for some, no anchoring signals were identified (e.g., ORF 84). There is growing evidence of the contribution of autolysins to microbial virulence (55). For example, the LytA amidase from Streptococcus pneumoniae (82) was shown to induce a protective response when inoculated into the lungs of mice (33). The three streptococcal cell wall hydrolases (LytA, LytB, and LytC), anchored to the membrane via teichoic acid residues, were recently shown to affect colonization of the nasopharynx by S. pneumoniae. Results of a genome-based approach to identify vaccine molecules affording protection against S. pneumoniae revealed that two of the six proteins conferring protection are autolysins (LytB and LytC [109]).
Vaccine candidates selected from ORF products with unknown function.
The Blast analysis carried out on the 5,045 ORF products resulted in ~2,000 ORF products for which a function could not be assigned (Fig. 1, left and middle lanes). This finding is similar to observations in other bacterial genomes, where, in spite of the increasing number of sequenced genomes, the assignment of a function to a sequence remains in many cases a challenge: ~20% of the predicted ORF products in a bacterial genome do not match any entry in the databases, and an additional 15 to 20% are similar to genes with no known function (24, 25).
Of the 2,000 ORF products with no clues as to their function, ~1,200 had no matches in the databanks whatsoever ("totally unknown") and ~800 were similar to proteins annotated in the databanks as hypothetical, uncharacterized, or putative proteins. The first step towards reducing the number of candidate ORF products was identification of genes common to B. anthracis and to the taxonomically related yet nonpathogenic B. halodurans, irrespective of their putative function. This step resulted in removal of ~500 ORFs, 100 from the category of unknown ORFs and 400 from the category of hypothetical ORFs (see Fig. 1). In an attempt to further reduce the number of candidates to a manageable number, filtering criteria were applied (as described above for ORF products exhibiting sequence similarity to known proteins): selection of candidates was targeted to genes encoding surface-exposed and/or secreted proteins, with fewer than two paralogs in the chromosome and up to four transmembrane segments. These reductions resulted in a total of 475 unknown and 138 hypothetical ORF products satisfying the above criteria (see Fig. 1). Considering the difficulty in distinguishing short noncoding ORF products from real genes, and given the bias towards short proteins in data sets of ORF products having no matches to proteins in databases (86), a further reduction of the unknown ORF product group, based on ORF length, was carried out. A total of 138 unknown ORF products remain upon filtering out ORF products shorter than 150 aa. The resulting number of unannotated ORF products is still rather large (~280 ORF products); therefore, further reduction is necessary prior to their experimental evaluation (Fig. 1).
Serological proteomic analysis of B. anthracis membranes. The proteomic analysis of a B. anthracis membrane-associated fraction was employed as a means to verify expression of the predicted membrane-associated candidates. A partial 2-DE proteomic map, representing the membrane-associated chromosomally encoded protein repertoire of a B. anthracis late stationary culture, derived from a strain devoid of the two virulence plasmids, was generated (see Materials and Methods and Fig. 2). Close to 100 protein spots (appearing either as unique or as multiple isoforms) were extracted from the gel, digested with trypsin, analyzed by MALDI-TOF MS and identified by comparing their MALDI-TOF spectra with the hypothetical tryptic digests of the B. anthracis chromosomal (draft sequence) ORFs data set. The comprehensive proteomic study, including detailed information pertaining to the proteinous composition of the B. anthracis cell membrane, will be reported elsewhere (Chitlaru et al., submitted for publication). In order to assess the immunogenic potential of the verified gene products and to identify immunorelevant antigens, a Western blot analysis of the two-dimensional gels with immune anti-B. anthracis antiserum was carried out (Fig. 2). By comparing the Coomassie blue-stained 2-DE gels with their respective immunoblots, it appears that at least 38 protein spots (numbered 1 to 38 in Fig. 2) are recognized by the immune sera. Following MALDI-TOF MS analysis of their tryptic digest fragments, it was found that these 38 spots represent isoforms of proteins encoded by 8 distinct ORFs, as detailed in Table 2. The seropositive proteins include four SLH proteins (products of ORF 2, ORF 3, ORF 8, and ORF 19) and four enzymes, one of which (AhpC, ORF 82) exhibits a putative membranal localization lipobox signal. Apart from the two S-layer proteins EA1 and Sap, none of the other seropositive proteins (Table 2) were previously described in B. anthracis. 
Furthermore, with the exception of the two S-layer proteins mentioned above, which were previously reported to cross-react with B. anthracis immune sera (58), none of the proteins distinguished by the present serological proteomic analysis had previously been shown to elicit an immune response in B. anthracis-exposed animals (see Discussion). Most notably, five of the eight seropositive proteins (products of ORF 2, ORF 3, ORF 8, ORF 19, and ORF 82 [Table 2]) were predicted to be potential immunogens (Table 1) by the present bioinformatic analysis.
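The spot identification described above (matching MALDI-TOF spectra against hypothetical tryptic digests of the ORF data set) can be illustrated with a minimal peptide mass fingerprinting sketch. It is not the software used in the study: the cleavage rule (after K or R, not before P, no missed cleavages), the 0.2-Da tolerance, the function names, and the toy sequences are all assumptions made for illustration.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added once per peptide (free N and C termini)
PROTON = 1.00728   # singly protonated ion, [M+H]+


def tryptic_peptides(seq):
    """Cleave after K or R unless the next residue is P (assumed rule, no missed cleavages)."""
    peptides, start = [], 0
    for i, aa in enumerate(seq):
        nxt = seq[i + 1] if i + 1 < len(seq) else ""
        if aa in "KR" and nxt != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])
    return peptides


def mh_plus(peptide):
    """Theoretical monoisotopic [M+H]+ mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON


def count_matches(observed, seq, tol=0.2):
    """Number of observed peaks within tol Da of any theoretical tryptic peptide of seq."""
    theoretical = [mh_plus(p) for p in tryptic_peptides(seq)]
    return sum(any(abs(m - t) <= tol for t in theoretical) for m in observed)


def rank_orfs(observed, orfs, tol=0.2):
    """Rank ORFs (dict: id -> protein sequence) by the number of matched peaks, best first."""
    scores = [(orf_id, count_matches(observed, seq, tol)) for orf_id, seq in orfs.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical toy sequences and peak list, not data from the study.
    toy_orfs = {"orf_a": "AKGGR", "orf_b": "WWWWK"}
    toy_peaks = [mh_plus("AK"), mh_plus("GGR")]
    print(rank_orfs(toy_peaks, toy_orfs))
```

A real search would also allow missed cleavages and residue modifications and would score matches statistically rather than by a raw count.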
FIG. 2. Serological proteome analysis of B. anthracis membranal proteins. B. anthracis ATCC 14185 (pXO1-, pXO2-) membranal proteins were separated by 2-DE (IEF on a pH 3 to 10 IPG strip). The gel was stained with Coomassie blue for total protein spot detection. Twin gels were transferred to nitrocellulose membranes and probed with guinea pig anti-B. anthracis immune sera (whole-cell lysate titer of 1:12,800). The Coomassie blue stain is shown on the left and the respective Western blot on the right. The complete 2-DE gel (A) and enlarged sections (B through E) are shown. Western blots in panel A and regions B through D were probed with 1:1,000-diluted antiserum. For better resolution, the region depicted in E is taken from a 2-DE gel run on a pH 4 to 7 IPG strip (first dimension), and its respective Western blot was developed with 1:300-diluted antiserum. The seropositive protein spots are identified by running numbers. See Table 2 for the complete list of identified seropositive proteins detected in these experiments.
TABLE 2. Seroreactive proteins identified by MALDI-TOF-MS in the B. anthracis membranal fraction
In the study reported herein, putative B. anthracis vaccine candidates, representing proteins that are likely to be surface exposed, and/or are similar to documented virulence-related proteins, and/or contain sequence motifs characteristic of virulence factors or immunogens, were selected by a multistep computational analysis (Fig. 1) of the draft version of the B. anthracis Ames strain chromosome (February 2001, 460 contigs). Integration of the results, together with careful manual curation, resulted in identification of 520 potential antigenic proteins (240 proteins with putatively assigned function and 280 unknown/hypothetical proteins). As shown in Table 1, the 240 proteins with putatively assigned function and relevance to virulence/pathogenicity could be grouped into several functional categories: SLH proteins, adhesins, repeat proteins, enzymes, and "others." The assignment of an ORF product to a specific group is not always unequivocal, as larger proteins frequently comprise more than a single functional domain. However, grouping the candidates by putative function facilitates selection of candidate ORF products from each group as representatives of families documented to be involved in microbial pathogenesis. For example, the protein products of ORF 1 and ORF 4 are amidases anchored to the peptidoglycan layer of the bacterial membrane via their SLH domains. These ORF products could be grouped either as SLH proteins or as enzymes. The rationale for choosing these putative proteins as vaccine candidates is that amidases and autolysins have been shown to be involved in virulence, to confer protective immunity, and/or to act as adhesins (e.g., see reference 109). Moreover, amidases and autolysins are modular enzymes that probably use different membrane-anchoring modalities (SLH, choline binding, etc.) as a means of adaptation to a particular biological niche and thus may also act as adhesins (55).
Certain microbial pathogens produce virulence factors that are expressed only during infection (phase variation) and that harbor in their primary sequence patterns known collectively as tandem repeats (Table 1). This group of repeat proteins contains both surface-anchored proteins and proteins without obvious secretion and/or anchoring signals (e.g., the most recently documented group of gram-positive virulence determinants, named anchorless adhesins [16]). In addition to their reported participation in adhesion, invasion, or immune evasion, repeat proteins have recently been implicated in Fe3+ siderophore regulation (NEAT [near transporter repeat] repeat proteins [5]) and thus may affect the survival of the bacteria within the host. Examples of such putative Fe3+ siderophore regulatory proteins are the B. anthracis NEAT protein products of ORF 70, ORF 71, ORF 234, and probably also ORF 6. In particular, the products of ORFs 70 and 71, which are anchored proteins harboring several copies of the NEAT domain, are the best matches to the siderophore regulatory proteins described by Andrade et al. (5) as present in gram-positive organisms (mostly pathogenic). As mentioned by Andrade et al., these ORFs are indeed located adjacent to iron ABC transporters in the B. anthracis chromosome.
The candidate gene list also includes proteins of unknown function that are anchored to the bacterial membrane by diverse gram-positive-specific anchoring modes. Representatives include sortase-anchored proteins (both iron dependent and iron independent [13, 18, 48]). Sortases are membrane proteins that cleave the polypeptide chain between two amino acids within a characteristic C-terminal motif and subsequently catalyze the formation of an amide bond between the carboxyl group of the cleaved polypeptide and the amino group of peptidoglycan cross-bridges. Since most sortase-anchored proteins are considered essential for bacteria to establish successful infection (39), such proteins could be relevant candidates. An example of this group is the product of ORF 228, an anchored repeat protein that is relatively unique to B. anthracis and exhibits weak similarity to a protein involved in immune evasion of Mycoplasma.
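The sortase mechanism described above suggests a simple sequence screen, sketched here under stated assumptions. The text does not name the C-terminal motif; the sketch assumes the canonical LPXTG sorting signal of sortase A substrates, which is cleaved between the threonine and the glycine. The 50-residue tail window, the function name, and the toy sequence are hypothetical, and sortases that recognize other motifs would need different patterns.

```python
import re

# Assumed motif: LPXTG (X = any residue); not specified in the source text.
SORTING_MOTIF = re.compile(r"LP.TG")


def find_sorting_signal(seq, tail=50):
    """Locate the last LPXTG motif within the C-terminal `tail` residues.

    Returns (motif_start, cleavage_index), where seq[:cleavage_index] ends
    with the threonine that becomes linked to the peptidoglycan cross-bridge,
    or None when no motif lies entirely inside the tail window.
    """
    offset = max(0, len(seq) - tail)
    hits = list(SORTING_MOTIF.finditer(seq, offset))
    if not hits:
        return None
    start = hits[-1].start()
    return start, start + 4  # cleavage between T (start + 3) and G (start + 4)


if __name__ == "__main__":
    # Hypothetical toy protein: filler residues, motif, then a short tail.
    toy = "M" * 60 + "LPETG" + "A" * 20
    print(find_sorting_signal(toy))
```

A production screen would typically also require the hydrophobic stretch and positively charged tail that follow the motif; those checks are omitted here for brevity.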
The large number of candidates selected by the bioinformatic analyses (520 putative proteins) necessitates implementation of additional filtering strategies. Reducing the candidates to a number that can be evaluated experimentally would involve application of a high-throughput biological screening system (such as proteomic-based analysis of in vivo-expressed immunogens or in vitro-in vivo expression systems [7, 14, 25]) and/or application of additional computational steps directed toward selection of microorganism-specific genes.
B. anthracis, B. cereus, and B. thuringiensis are considered essentially one genetic species and members of the B. cereus group of bacteria (28). In spite of the overall genetic similarity, the fact that these species differ significantly in pathogenesis may imply the presence of B. anthracis-specific virulence determinants. Subtraction of genes common to B. anthracis and B. cereus 14579 (gapped genome, Integrated Genomics, Inc.) could reveal the presence of B. anthracis-specific genes. Preliminary subtraction, from the group of putative proteins with assigned functions (known proteins), of ORF products exhibiting significant overall sequence similarity to B. cereus orthologs results in 80 B. anthracis-specific ORF products, leaving out most of the classical B. cereus group virulence factors. As for the unknown protein subgroup, 100 ORF products did not exhibit extensive sequence similarity to B. cereus 14579 proteins. This subtraction was not based on threshold values alone (mutual coverage of 85%, expectation values smaller than e-10) but included other considerations regarding specific sequence variations (e.g., extent of insertions, divergence, etc.). Because differences may also be ascribed to genomic context, and in view of the limitations imposed by the quality of draft genomes versus complete genomes, this reduction should be carried out more carefully once sequencing of the more closely related B. cereus 10987 is finished. One should also keep in mind that effective immunogens are not necessarily B. anthracis specific; thus, this type of subtraction should probably be applied only as an optional reductive measure.
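The threshold part of the subtraction described above can be sketched as follows. This is only an illustration of the two stated cutoffs (mutual coverage of 85%, expectation value below e-10); the manual considerations mentioned in the text (insertions, divergence, genomic context) are not modeled. The `Hit` record, its field names, and the approximation of coverage as aligned length divided by sequence length are assumptions, not the authors' pipeline.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Best B. cereus match for one B. anthracis ORF product (hypothetical record layout)."""
    query_id: str      # B. anthracis ORF identifier
    query_len: int     # length of the B. anthracis protein
    subject_len: int   # length of the B. cereus protein
    align_len: int     # number of aligned residues (assumed equal on both sequences)
    evalue: float      # expectation value of the alignment


def mutual_coverage(hit):
    """Fraction of the shorter-covered partner spanned by the alignment."""
    return min(hit.align_len / hit.query_len, hit.align_len / hit.subject_len)


def is_shared(hit, min_cov=0.85, max_evalue=1e-10):
    """True when the hit passes both cutoffs stated in the text."""
    return mutual_coverage(hit) >= min_cov and hit.evalue < max_evalue


def subtract_shared(orf_ids, hits):
    """Return the ORFs with no qualifying B. cereus counterpart, in input order."""
    shared = {h.query_id for h in hits if is_shared(h)}
    return [orf for orf in orf_ids if orf not in shared]


if __name__ == "__main__":
    # Hypothetical toy hits, not data from the study.
    toy_hits = [
        Hit("orf_x", 100, 100, 95, 1e-40),  # passes both cutoffs: subtracted
        Hit("orf_y", 100, 300, 95, 1e-40),  # covers only part of the subject: kept
        Hit("orf_z", 100, 100, 95, 1e-5),   # expectation value too high: kept
    ]
    print(subtract_shared(["orf_x", "orf_y", "orf_z", "orf_w"], toy_hits))
```

Requiring coverage of both partners, rather than of the query alone, keeps a B. anthracis ORF that matches only one domain of a larger B. cereus protein from being subtracted.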
To demonstrate the expression and cellular location of the in silico-selected chromosomal gene products and to expedite the identification of B. anthracis immunogenic membrane and/or outer surface proteins, a direct proteomic inspection of the B. anthracis membrane protein fraction was carried out. The proteomic analysis involved separation of a B. anthracis ATCC 14185 (pXO1-, pXO2-) subcellular membranal fraction by 2-DE and identification of the most abundant protein spots by MALDI-TOF MS analysis of their tryptic digestion fingerprints. Close to 100 spots from the 2-DE gel were analyzed and found to represent 32 proteins (detailed results of the analysis are documented in another report [Chitlaru et al., submitted]). In interpreting the proteomic data, one should take into account that the proteomic approach inherently underestimates the number of gene products due to (i) bias toward identification of abundant proteins (this effect is even more pronounced when one particular protein species prevails in the preparation, as is the case for the S-layer protein EA1 in the membrane fraction of B. anthracis [e.g., Fig. 1, box B]); (ii) differential protein expression, depending on the origin of the membrane fraction (culture conditions and in vitro versus in vivo gene expression); and (iii) the sample preparation procedure. Membrane proteins are notoriously difficult to separate by 2-DE due to solubilization constraints and may not be represented in the two-dimensional map despite their abundance. The bioinformatic approach may circumvent these limitations and may therefore identify gene candidates representing the complete gene repertoire of each organism; following further individual exploration of their immunogenic potential, these candidates will aid in developing improved protective and therapeutic measures.
Here we report on a serological proteome analysis carried out to address the in vivo immunogenicity of the B. anthracis membrane proteins identified. Thirty-eight spots were found to cross-react with sera from B. anthracis-infected animals (Fig. 2). The analysis also established that the cross-reactive spots, which represent the products of eight ORFs, are indeed expressed in vivo (in guinea pigs) during exposure to B. anthracis and are able to elicit an immune response (Fig. 2 and Table 2). Most notably, five of these eight proteins (four SLH proteins and the AhpC/peroxiredoxin) were predicted to be potentially antigenic by the present independent in silico survey. Although antibodies against the S-layer proteins EA1 and Sap (ORFs 3 and 2, respectively) were described before in infected animals (22, 58), neither the expression nor the in vivo immunogenicity of the other two, novel SLH proteins (ORFs 8 and 19) was noted before. A humoral response against AhpC (ORF 82) was reported in other virulent bacterial systems such as Legionella pneumophila and Helicobacter pylori (27, 72, 91) but not for B. anthracis or B. cereus. The seropositive methylcitrate dehydratase MngE/PrpD (gi|21400220, Table 2) was not previously invoked as a potential immunogen in other systems, yet it was shown to be necessary for survival of Legionella in the macrophage (72, 91). About 50% of the immunogenic proteins identified in the Western blot of the 2-DE gel are S-layer homology domain proteins (Table 2). Nevertheless, comparison of the Coomassie blue-stained 2-DE gels and their respective Western blots (Fig. 2) appears to reveal a differential order of immunopotencies among these seropositive proteins. For example, AhpC (spots 34 to 36) appears to be an exceptionally strong immunogen, since it gives a weak signal in the Coomassie blue-stained gel but an intense signal in the Western blot.
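The overlap between the two approaches summarized above can be restated as a small set computation. The ORF labels are those named in the text; the two `unnamed_enzyme_*` entries are hypothetical placeholders for seropositive enzymes that this section does not name, and the candidate set lists only the in silico candidates mentioned by number in the text, not all 520.

```python
# Seropositive proteins (8 ORF products). The two "unnamed_enzyme_*" labels
# are hypothetical placeholders, not identifiers from the study.
seropositive = {
    "ORF 2", "ORF 3", "ORF 8", "ORF 19",  # SLH proteins
    "ORF 82",                             # AhpC
    "gi|21400220",                        # methylcitrate dehydratase
    "unnamed_enzyme_1", "unnamed_enzyme_2",
}

slh_proteins = {"ORF 2", "ORF 3", "ORF 8", "ORF 19"}

# In silico candidates that the text mentions by ORF number (a partial list).
predicted_candidates = {
    "ORF 1", "ORF 2", "ORF 3", "ORF 4", "ORF 6", "ORF 8", "ORF 19",
    "ORF 70", "ORF 71", "ORF 82", "ORF 228", "ORF 234",
}


def summarize(seropositive, predicted, slh):
    """Count seropositive proteins that were also predicted, and the SLH share."""
    confirmed = seropositive & predicted
    return {
        "seropositive": len(seropositive),
        "predicted_and_seropositive": len(confirmed),
        "slh_fraction": len(seropositive & slh) / len(seropositive),
    }


if __name__ == "__main__":
    print(summarize(seropositive, predicted_candidates, slh_proteins))
```

With these inputs the summary gives eight seropositive proteins, five of which were also predicted in silico, and an SLH share of one half, matching the figures quoted in the text.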
In conclusion, as demonstrated in this study, combining bioinformatic chromosome screening with serological proteome analysis allows for judicious selection of in vivo immunogens. While the bioinformatic strategy resulted in identification of 240 vaccine candidates with putative functions (out of the 5,045 assigned and annotated ORFs derived from the B. anthracis chromosome draft sequence), the serological proteome analysis makes it possible to focus on putative anthrax vaccine candidate genes by confirming their in vivo expression and antigenicity.