Table 1.

Selection of ORFs from the S. pneumoniae serotype 4 predicted amino acid sequence

| Motif name (reference) | Sequence motif<sup>a</sup> | No. of ORFs selected for expression |
| --- | --- | --- |
| SPase I signal sequence<sup>b</sup> | | 30 |
| SPase II signal sequence<sup>b</sup> | LXXC; (D,E,R,K)X6(L,G)XX(V,A)-C | 26 |
| Cell wall anchor<sup>c</sup> | LPXTG, LPXAG, LPXTN | 34 |
| Choline-binding domain<sup>d</sup> | | 11 |
| Homology to virulence factors | | 22 |
| Integrin-binding domain (22) | RGD | 2 |
| Type IV prepilin signal sequence<sup>e</sup> | | 5 |
| Total | | 130 |

  • a Amino acid sequences are represented by single-letter designations; “X” indicates any amino acid; any amino acid in parentheses may occupy the position indicated.

  • b Proteins were analyzed for SPase I signal sequences using P-sort (17) or SignalP (19) algorithms and were predicted to have either cleavable or noncleavable SPase recognition sites. ORFs containing putative SPase I or II signal sequences (25) were further evaluated for the presence of a methionine start codon preceding a sequence encoding a short (<30 amino acids) hydrophobic region as predicted by Kyte-Doolittle analysis using the LaserGene DNAStar software package.

  • c In addition to the listed sequence residues, proteins with cell wall-anchoring motifs were also examined for a characteristic N-terminal hydrophobic region followed by at least one basic residue (R or K) after the anchoring motif (8, 18).

  • d Choline-binding proteins were predicted by comparison to the C-terminal repeat region of PspA and to the consensus domain previously described for other gram-positive bacteria (26, 28).

  • e Type IV prepilin signal sequences were identified by comparison to the ComG locus of Bacillus subtilis (7).
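
The motif notation defined in footnote a maps naturally onto regular expressions, so a scan of a predicted protein sequence for the listed motifs can be sketched as below. This is an illustrative sketch only, not the procedure used to build the table: the function name `find_motifs`, the dictionary `MOTIFS`, and the toy sequence are hypothetical, and the reading of "X6" as six consecutive arbitrary residues and of "-C" as an immediately following cysteine is an assumption. The additional criteria in footnotes b and c (hydrophobic regions, basic residues) are not modeled.

```python
import re

# Motifs from Table 1, transcribed by hand into regular expressions.
# "X" (any amino acid) becomes ".", and a parenthesized set such as
# (L,G) becomes the character class [LG].
# Assumption: "X6" means six arbitrary residues; "-C" means a cysteine follows.
MOTIFS = {
    "SPase II (LXXC)": r"L..C",
    "SPase II (extended)": r"[DERK].{6}[LG]..[VA]C",
    "Cell wall anchor": r"LP.(?:TG|AG|TN)",
    "Integrin-binding (RGD)": r"RGD",
}


def find_motifs(protein):
    """Return {motif name: [1-based start positions]} for motifs found in protein."""
    hits = {}
    for name, pattern in MOTIFS.items():
        # A zero-width lookahead reports every start position,
        # including matches that overlap one another.
        scanner = re.compile(f"(?=(?:{pattern}))")
        positions = [m.start() + 1 for m in scanner.finditer(protein)]
        if positions:
            hits[name] = positions
    return hits


if __name__ == "__main__":
    # Toy sequence constructed for illustration; not a real pneumococcal ORF.
    toy = "MKAAAAAALSSACLAACLPETGRGD"
    for motif, positions in find_motifs(toy).items():
        print(motif, positions)
```

A match from such a scan only flags a candidate; short patterns like RGD or LXXC occur by chance in many proteins, which is presumably why the footnotes describe further checks before an ORF was selected.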