ABSTRACT
DNA microarray analysis was used to compare the gene expression profiles of Leptospira interrogans serovar Lai type strain 56601 and its corresponding attenuated strain IPAV. A 22-kb genomic island covering a cluster of 34 genes (i.e., genes LA0186 to LA0219) was actively expressed in both strains but concomitantly upregulated in strain 56601 relative to strain IPAV. Reverse transcription-PCR assays showed that the gene cluster comprised five transcripts. Gene annotation of this cluster revealed characteristics of a putative prophage-like remnant with at least 8 of 34 sequences encoding prophage-like proteins, of which the LA0195 protein is a putative prophage CI-like regulator. The transcription initiation activities of putative promoter-regulatory sequences of transcripts I, II, and III, all proximal to the LA0195 gene, were further analyzed in the Escherichia coli promoter probe vector pKK232-8 by assaying the reporter chloramphenicol acetyltransferase (CAT) activities. The strong promoter activities of both transcripts I and II indicated by the E. coli CAT assay were well correlated with the in vitro sequence-specific binding of the recombinant LA0195 protein to the corresponding promoter probes detected by the electrophoretic mobility shift assay. On the other hand, the promoter activity of transcript III was very low in E. coli, and its promoter probe failed to show active binding to the LA0195 protein in vitro. These results suggested that the LA0195 protein is likely involved in the transcription of transcripts I and II. However, the identical complete DNA sequences of this prophage remnant from these two strains strongly suggest that regulatory factors or signal transduction systems residing outside of this region within the genome may be responsible for the differential expression in these two strains.
Leptospirosis is a globally important zoonosis caused by infection with pathogenic Leptospira species, including L. alexanderi, L. borgpetersenii, L. interrogans sensu stricto, L. kirschneri, L. noguchii, L. santarosai, L. weilii, L. fainei, L. inadai, and L. meyeri (1, 24). It affects a wide range of mammalian hosts, and human beings and animals become infected through contact with urine-contaminated soil and water. Because of the broad spectrum of animal species that serve as reservoirs, leptospirosis is considered to be the most widespread zoonosis.
Prophages are commonly found in sequenced bacterial genomes, and many of them appear to be defective and in a state of mutational decay (15, 18-20). However, both intact and defective prophages may play a special physiological role in the host bacteria and, thus far, they have been implicated in serotype conversion, pathogenesis, and phage immunity (13, 15, 16, 18, 20, 22, 29, 38, 53). Genomic sequences of L. interrogans have been determined for two strains of different serotypes (40, 43), but no intact phage sequence was identified in either of them.
In the present study, we compared the gene expression profiles of the virulent L. interrogans serogroup Icterohaemorrhagiae serovar Lai type strain 56601 and its corresponding high-passage avirulent strain IPAV under the same in vitro growth conditions and noticed that a fragment of the genome, ranging from genes LA0186 to LA0219, was differentially expressed. Further analysis showed that this fragment was a defective prophage remnant encoding many phage-related genes. We further identified the five transcripts within this prophage region and characterized the possible regulatory protein encoded by one of the prophage genes (gene LA0195).
MATERIALS AND METHODS
Bacterial strains, plasmids, and culture conditions. The bacterial strains and plasmids used in the present study are listed in Table 1. The virulent strain 56601 and the avirulent variant strain IPAV of L. interrogans serogroup Icterohaemorrhagiae serovar Lai were grown under aerobic conditions in EMJH liquid medium (33) either at 28°C, the optimum growth temperature of pathogenic Leptospira in vitro (used for DNA extraction), or at 37°C, a temperature similar to that encountered during invasion of mammalian hosts (used for RNA extraction). Only mid-log-phase cultures of 100 ml each at a density of ∼10^8 ml−1 were used for gene expression profiling. The cells were harvested by centrifugation at 10,000 × g for 10 min at 4°C.
Bacterial strains and plasmids used in this study
Escherichia coli was routinely grown in LB medium at 37°C (46), except where indicated otherwise. Antibiotics, when required, were used at the following concentrations: ampicillin at 100 μg ml−1, kanamycin at 50 μg ml−1, and chloramphenicol at 34 μg ml−1.
DNA manipulation. Genomic DNA of L. interrogans was extracted by using a bacterial DNA minikit (Watsonbiot, China). PCR fragments were purified from agarose gel with a QIAquick gel extraction kit (Qiagen). Restriction endonucleases were purchased from various suppliers and were used according to their specifications. Plasmid DNA from E. coli was prepared by using a plasmid minikit (Tiangen, China).
The genomic draft sequence of the L. interrogans serotype Lai avirulent variant strain IPAV was determined by 454 sequencing (unpublished data). The finished sequence corresponding to the gene cluster analyzed here was generated by direct Sanger sequencing of the PCR-amplified gap-filling fragments.
Bioinformatics analysis. The complete genomic sequence of virulent L. interrogans serovar Lai type strain 56601 is available (http://www.chge.sh.cn/lep/). The phage-like sequences were identified in silico via protein-protein BLAST search (http://www.ncbi.nlm.nih.gov/BLAST/) and were compared to the annotated DNA sequence of the corresponding genomic region of strain IPAV. Conserved protein structural domains and motifs were searched online (http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml), and the Pfam domains were also searched online (http://pfam.sanger.ac.uk/). Transcription operons and the corresponding promoters were predicted using Softberry software (Softberry, Mount Kisco, NY). Possible symmetrical repeats located within the putative promoter/operator regions of the prospective transcripts were detected by using an in silico analysis tool (http://www.mgs.bionet.nsc.ru/mgs/programs/OligoRep/InpForm.htm).
RNA isolation. L. interrogans strains were cultured in EMJH medium at 37°C to mid-log phase, and total RNA was extracted from triplicate independent samples. RNA was extracted by using TRIzol reagent (Invitrogen) according to the manufacturer's protocol. Contaminating DNA was digested with RQ1 RNase-free DNase (Promega). The treated RNA was purified by using a Qiagen RNeasy kit, while the quality was verified by agarose gel electrophoresis, and the quantity was determined spectrophotometrically (Eppendorf).
Microarray analysis. The microarray chip was prepared as previously described (42). In brief, the 3,528 protein coding sequences (CDSs) successfully amplified by PCR were printed in triplicate on the slides as probes. Three expression profiling experiments were performed via reverse transcription (RT) using Superscript (Invitrogen); in one experiment the cDNA of IPAV was labeled with Cy3 and the cDNA of 56601 was labeled with Cy5, while in the other two experiments the cDNA of 56601 was labeled with Cy3 and that of IPAV was labeled with Cy5. The unincorporated dye was removed by using a QIAquick nucleotide removal kit (Qiagen) as specified by the manufacturer's protocol. Samples were hybridized at 42°C for 16 h and then washed as described in the manual. Hybridization experiments were performed in triplicate by using cDNA derived from three different cultures of virulent strain 56601 and avirulent strain IPAV grown at 37°C. The hybridized slides were scanned and analyzed by Tiffsplit (Agilent) to calculate the signal intensities and to determine the presence or absence of each CDS. The data were then normalized, and their backgrounds were defined by using GeneSpring 4.0 (Silicon Genetics). The GeneSpring software was used to further analyze the transcription patterns. To identify genes with significantly altered expression levels, a cutoff of 1.5 for the expression level ratio was used to select genes whose expression changed more than 1.5-fold in either direction in three independent biological samples (27, 48, 50). A Student t test analysis of variance was used to compare the mean expression levels of the test and reference samples. Genes with significant differential expression levels (P < 0.05) were selected.
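The fold-change filter described above can be illustrated with a minimal Python sketch. It assumes that each gene has three replicate 56601/IPAV expression ratios, applies the 1.5-fold cutoff to every replicate, and then computes a one-sample t statistic on the log2 ratios. The one-sample form of the test, the function names, and the example ratios are assumptions made for illustration only; the normalization and statistics in the study itself were carried out in GeneSpring.

```python
import math
import statistics

# Two-sided 5% critical value of Student's t for 2 degrees of freedom (n = 3).
T_CRIT_DF2 = 4.303


def passes_fold_cutoff(ratios, cutoff=1.5):
    """True if every replicate changes at least `cutoff`-fold in the same direction."""
    up = all(r >= cutoff for r in ratios)
    down = all(r <= 1.0 / cutoff for r in ratios)
    return up or down


def t_statistic(ratios):
    """One-sample t statistic of the log2 ratios against 0 (no change)."""
    logs = [math.log2(r) for r in ratios]
    mean = statistics.mean(logs)
    sd = statistics.stdev(logs)
    if sd == 0:
        # Identical replicates: the statistic is unbounded (or 0 if there is no change).
        return math.copysign(math.inf, mean) if mean != 0 else 0.0
    return mean / (sd / math.sqrt(len(logs)))


def is_differential(ratios, cutoff=1.5):
    """Combine the fold-change filter with the t test at P < 0.05 (n = 3 only)."""
    return passes_fold_cutoff(ratios, cutoff) and abs(t_statistic(ratios)) > T_CRIT_DF2


# Hypothetical replicate ratios (56601 / IPAV) for one gene.
example = [3.2, 2.9, 1.8]
print(passes_fold_cutoff(example), round(t_statistic(example), 2), is_differential(example))
```

Working on log2 ratios makes up- and downregulation symmetric around zero, which is why the cutoff is checked as both 1.5 and 1/1.5 on the raw ratios.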
Real-time PCR. The same RNA samples subjected to microarray analysis were reverse transcribed to cDNA by using Superscript II (Invitrogen) to confirm the changes of expression for selected genes by quantitative PCR (Roche) as described previously (47) with Sybr green dye (Invitrogen). For each amplification run, the CT (threshold cycle) of 16S rRNA gene amplified from the corresponding sample was used as the internal control to normalize the tested gene amplicon for CT calculation (37, 47), and the relative fold changes were further calculated as described previously (37).
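The normalization to the 16S rRNA gene can be sketched as follows. The text only cites reference 37 for the fold-change calculation, so the widely used 2^(-ΔΔCT) formula shown here is an assumption, as are the function name, the 100% amplification efficiency it implies, and the example CT values.

```python
def relative_fold_change(ct_target_test, ct_ref_test, ct_target_ctrl, ct_ref_ctrl):
    """Relative expression of a target gene in the test sample versus the control sample.

    Each CT is first normalized to the reference gene (here, 16S rRNA) from the
    same sample; the difference between the two normalized values is the delta-delta CT.
    Assumes the amplicon doubles every cycle (100% efficiency).
    """
    delta_test = ct_target_test - ct_ref_test
    delta_ctrl = ct_target_ctrl - ct_ref_ctrl
    delta_delta = delta_test - delta_ctrl
    return 2.0 ** (-delta_delta)


# Hypothetical CT values: target gene and 16S rRNA in strain 56601 (test) and IPAV (control).
fold = relative_fold_change(ct_target_test=20.0, ct_ref_test=10.0,
                            ct_target_ctrl=22.0, ct_ref_ctrl=10.0)
print(fold)
```

A lower normalized CT in the test sample means more starting template, so a ΔΔCT of -2 corresponds to a fourfold higher relative expression.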
Operon transcript analysis. The AMV Reverse Transcription system (Promega) was used for RT experiments according to the manufacturer's instructions. PCR primers designed to detect transcripts are listed in Table 2. To confirm that the RNA sample was free of contaminating DNA, control PCR amplifications with the RNA sample plus the primers described above were performed. RNA samples whose control amplification produced a PCR fragment, indicating contamination with genomic DNA, were excluded from further analysis.
Primers used for RT-PCR analysis for transcript identification
Promoter analysis. Several domains of the promoter/operator region for a prospective transcript were cloned into promoter probe vector pKK232-8 (Amersham Pharmacia Biotech, Sweden), upstream of the cat gene. PCR primers for the putative promoters of the prospective transcripts were as follows, with BamHI and HindIII restriction sites introduced at the 5′ ends of the primers (underlined): transcript I, LA0186-LA0194p-forward (5′-CGGGATCCTCCAACTCAGAATCAGAAAC-3′) and LA0186-LA0194p-reverse (5′-CCCAAGCTTCTCAAAAAACTCATCGAAAT-3′); transcript II, LA0195p-forward (5′-CGGGATCCGGTATATGAATGTTGTCCGT-3′) and LA0195p-reverse (5′-CCCAAGCTTCATTTTTAACGTATCGTTGT-3′); and transcript III, LA0198-LA0202p-forward (5′-CGGGATCCGATCCATTCATTCGTACCTT-3′) and LA0198-LA0202p-reverse (5′-CCCAAGCTTCCTGTTTGGATCGAATACTA-3′).
The PCR products and pKK232-8 were digested with BamHI plus HindIII and purified from agarose gel by using a QIAquick gel extraction kit (Qiagen). The PCR products were then ligated individually into promoter probe vector pKK232-8, creating three recombinant putative promoter-cat fusion plasmids designated pKP0194, pKP0195, and pKP0198 (Table 1).
To assay the promoter activities, E. coli XL1-Blue harboring a putative promoter-cat fusion plasmid was grown at 37°C to mid-log phase in LB supplemented with ampicillin. The culture was then subcultured at a dilution of 1:100 in 5 ml of LB containing 100 μg of ampicillin ml−1. Chloramphenicol acetyltransferase (CAT) activities were determined by using an enzyme-linked immunosorbent assay test (Roche, Germany) according to the manufacturer's instructions on crude bacterial extracts prepared by sonication. E. coli XL1-Blue transformed with pKK232-8 without insert was used as a control.
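Because the underlining that marked the restriction sites is not preserved in plain text, the position of each site can be recovered with a short Python sketch. The primer sequences are copied from the paragraph above; the recognition sequences GGATCC (BamHI) and AAGCTT (HindIII) are the standard ones, and the dictionary layout and function name are illustrative assumptions.

```python
# Standard recognition sequences of the two enzymes used for cloning.
SITES = {"BamHI": "GGATCC", "HindIII": "AAGCTT"}

# Promoter primers as listed in the text (5' to 3').
PRIMERS = {
    "LA0186-LA0194p-forward": "CGGGATCCTCCAACTCAGAATCAGAAAC",
    "LA0186-LA0194p-reverse": "CCCAAGCTTCTCAAAAAACTCATCGAAAT",
    "LA0195p-forward": "CGGGATCCGGTATATGAATGTTGTCCGT",
    "LA0195p-reverse": "CCCAAGCTTCATTTTTAACGTATCGTTGT",
    "LA0198-LA0202p-forward": "CGGGATCCGATCCATTCATTCGTACCTT",
    "LA0198-LA0202p-reverse": "CCCAAGCTTCCTGTTTGGATCGAATACTA",
}


def site_position(primer, enzyme):
    """0-based index of the first occurrence of the enzyme's site, or -1 if absent."""
    return primer.find(SITES[enzyme])


for name, seq in PRIMERS.items():
    # Forward primers carry the BamHI site, reverse primers the HindIII site.
    enzyme = "BamHI" if name.endswith("forward") else "HindIII"
    print(f"{name}: {enzyme} site at index {site_position(seq, enzyme)}")
```

In each primer the site sits a few bases in from the 5′ end, preceded by a short clamp (CG or CCC), which is a common design for primers whose products are digested directly.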
Cloning and overexpression of the LA0195 protein. Based on the DNA sequence of the LA0195 gene, the primers for cloning were designed with BamHI and HindIII restriction sites incorporated (underlined) as follows: LA0195-forward (5′-CGGGATCCGATGTCAAGAATTAATACGA-3′) and LA0195-reverse (5′-CCCAAGCTTGGGTTATTTTTTCTTTCTAAA-3′).
The PCR products were ligated to the His6 expression vector pET28b (Novagen) under the T7 promoter after digestion with BamHI and HindIII, creating the plasmid pET28b-0195. pET28b-0195 was introduced into E. coli BL21(DE3) competent cells as described previously (21). Upon induction with IPTG (isopropyl-β-d-thiogalactopyranoside), the cloned LA0195 gene was overexpressed in E. coli, and the protein was purified on Ni-NTA agarose (Qiagen) in Tris-HCl buffer and eluted with an imidazole gradient. The protein was detected by sodium dodecyl sulfate-polyacrylamide gel electrophoresis.
Electrophoretic mobility shift assay (EMSA). Three probes were generated by PCR using 5′-biotin-labeled and unlabeled primers: probe 1, a 323-bp fragment containing a putative promoter of the transcript I sequence (primers 5′-CGGGATCCTCCAACTCAGAATCAGAAAC-3′ and 5′-CCCAAGCTTCTCAAAAAACTCATCGAAAT-3′); probe 2, a 279-bp fragment containing a putative promoter of the transcript II sequence (primers 5′-CGGGATCCGGTATATGAATGTTGTCCGT-3′ and 5′-CCCAAGCTTCATTTTTAACGTATCGTTGT-3′); and probe 3, a 371-bp fragment containing a putative promoter of the transcript III sequence (primers 5′-CGGGATCCGATCCATTCATTCGTACCTT-3′ and 5′-CCCAAGCTTCCTGTTTGGATCGAATACTA-3′).
Probes were purified with a QIAquick gel extraction kit. An EMSA was performed by using a LightShift Chemiluminescent EMSA kit (Pierce). The purified protein was incubated with probe at 30°C for 20 min in a 20-μl binding reaction containing 1× binding buffer, 5 mM MgCl2, 2.5% glycerol, 0.05% NP-40, 1 μg of poly(dI-dC), and 10 fmol of biotin-labeled probe. A range of amounts of the LA0195 protein from 0 to 0.2 μg was tested (details are provided in the legend to Fig. 4). Competitor experiments with 50- and 100-fold excesses of unlabeled probe as a specific competitor or poly(dI-dC) as a nonspecific competitor were used to demonstrate the specificity of LA0195 protein binding. Samples were subjected to electrophoresis at 120 V in 1% agarose gel in 0.5× Tris-borate-EDTA for 1.5 h; the DNA was then electrophoretically transferred to a nylon membrane at 380 mA for 60 min and cross-linked to the membrane by using a UV-light cross-linker (UVP, Upland, CA). After cross-linking, biotin-labeled DNA was detected by chemiluminescence and exposed to X-ray film for 5 to 10 min.
RESULTS
The LA0186 to LA0219 gene cluster of L. interrogans serovar Lai virulent strain 56601 was actively transcribed in its counterpart avirulent strain IPAV. Microarray-based comparative expression profiling of L. interrogans strain 56601 against its avirulent counterpart strain IPAV was performed with samples cultured in EMJH medium at 37°C. These microarray expression profiles have been deposited at the National Center for Biotechnology Information Gene Expression Omnibus database (http://www.ncbi.nlm.nih.gov/geo/info/linking.html), and the corresponding accession number is GSE10788. The results from all three biologically independent replicates indicated that a gene cluster ranging from genes LA0186 to LA0219 was concomitantly expressed and upregulated in the type strain 56601 compared to its avirulent counterpart strain IPAV (Table 3). However, although the trends of expression were the same in the three experiments, the fold difference in the third sample was lower than that in the first two samples. Real-time RT-PCR was performed on the third sample, which confirmed the expression profiling data and thus indicated that the discrepancy was not due to operational error (data not shown).
Functional annotation and expression profiling of the prophage-like genomic island (genes LA0186 to LA0219)
The complete DNA sequence of this gene cluster from strain IPAV was determined and is identical to that from strain 56601 (data not shown). Therefore, the genetic determinants that probably account for the expression differences of this gene cluster in these two strains must reside elsewhere in the genome rather than in this region. Further efforts were thus focused on characterization of the gene cluster rather than a mechanistic study of the differential expression.
Gene annotation of the LA0186 to LA0219 gene cluster revealed characteristics of a putative prophage-like genomic island. The active concomitant expression of this gene cluster in both virulent and avirulent strains of L. interrogans serovar Lai led to a comprehensive reannotation of the gene cluster. The average G+C content of this gene cluster is 35.1%, similar to that of the whole genome (36.0%). BLAST analysis of the 34 CDSs within this cluster against the nonredundant protein database, together with conserved-domain searches of GenBank and Pfam, showed that the majority of these CDSs (n = 21) are novel genes that, as previously annotated, failed to exhibit any similarity to proteins of known function in other organisms. However, in addition to the 5 previously annotated functional CDSs (Table 3), another 8 CDSs also exhibited homology with genes annotated by the National Center for Biotechnology Information.
Of the 13 annotated CDSs, 8 encoded proteins with various degrees of amino acid sequence homology with proteins likely related to phages and prophages (Table 3 and Fig. 1). The LA0205-, LA0209-, LA0211-, and LA0215-encoded proteins each exhibit different degrees of homology with head or tail morphogenesis proteins of different phages. The LA0189 gene was most similar to the gam gene of the Mu phage. The LA0193-encoded protein exhibits low homology with the putative phage integrase family of Pseudoalteromonas haloplanktis TAC125. The putative protein encoded by the LA0213 gene exhibits low homology with transposase, while the LA0195 gene encodes a transcriptional regulator exhibiting some homology with the prophage repressor CI. Based on the presence of genes homologous to phage functional and regulatory genes but the lack of plasmid replication or partitioning genes, we hypothesize that this fragment may not encode functions for autonomous extrachromosomal replication. Remarkably, at one end of this prophage-like cluster, the LA0219 protein was 70% similar to the LA0218 protein in amino acid sequence, which might be due to rearrangement.
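The G+C comparison above is a simple base-composition calculation, sketched here in Python. The function name and the short example sequence are hypothetical; the 35.1% and 36.0% values reported in the text come from the actual 22-kb cluster and the whole genome, which are not reproduced here.

```python
def gc_content(sequence):
    """Percentage of G and C bases among the A, C, G, and T bases of a DNA sequence."""
    seq = sequence.upper()
    counted = [base for base in seq if base in "ACGT"]  # ignore N and other symbols
    if not counted:
        raise ValueError("sequence contains no A, C, G, or T bases")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


# Hypothetical AT-rich fragment, used only to show the call.
print(round(gc_content("ATGTCAAGAATTAATACGA"), 1))
```

A G+C content close to the genome average, as reported for this cluster, gives no compositional signal of recent foreign origin, so the prophage assignment rests on the annotation rather than on base composition.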
Linear representation of the prophage-like genomic island, with its five transcripts tested by RT-PCR. Upper rectangles indicate the transcripts transcribed from right to left, and lower rectangles indicate those transcribed from left to right. Transcripts I to V are shown with all of their constituent CDSs, which are indicated by three kinds of arrows corresponding to three categories of functional annotation. The eight genes putatively related to prophages are illustrated (refer to the text for details). The colored arrows represent the primers used for RT-PCR identification of the transcripts (see Fig. 2). Forward and reverse primers for a single PCR are labeled with the same color. Refer to the GenBank annotation for transcription directions.
The LA0186-LA0219 gene cluster comprised five transcripts. The genes within this gene cluster are transcribed in both directions, as shown by genomic annotation. Cotranscription of these genes was analyzed by RT-PCR using L. interrogans RNA as a template (Fig. 2). Primers for this analysis annealed to adjacent or nearby genes, so that amplification in RT-specific reactions would occur only if the genes were cotranscribed (55). The RT-PCR assay revealed five transcripts (Fig. 1 and 2). The gene cluster LA0186-LA0194 formed a single transcript, was transcribed in a counterclockwise direction, and was designated transcript I. The LA0195 gene was transcribed singly in a clockwise direction and was designated transcript II. Transcript III comprised genes LA0196 and LA0197 and was transcribed in a counterclockwise direction. The other two transcripts, LA0198-LA0202 and LA0203-LA0219, were transcribed in a clockwise direction and were designated transcripts IV and V, respectively.
RT-PCR identification of cotranscribed putative operons within the LA0186-LA0219 gene cluster. RNA was isolated, and RT-PCR was performed as described in the text. Primer pairs spanning two or more of the genes (indicated above each lane) were used to determine whether adjacent or nearby genes are cotranscribed. Molecular size markers in kilobases are indicated. −RT, a negative control, in which the RT step was omitted.
The putative promoters of transcripts I and II are actively expressed in E. coli, whereas very low expression activity was detected for the putative promoter of transcript III. The putative promoter or regulatory regions of transcripts I, II, and III, all proximal to gene LA0195 but with different transcription directions, were identified by using the Softberry prokaryotic promoter prediction tools. The analysis indicated that the specific promoter-regulatory sequences (−10 and −35 boxes) were located upstream of transcripts I, II, and III at distances of −146, −144, and −123 bp, respectively, from the translational start site. At the same time, we examined the sequence between transcripts I and II for possible symmetrical repeats by using an in silico analysis tool and found one direct repeat and one inverted repeat sequence in the corresponding regions, as shown in Fig. 3.
Putative −35 and −10 promoter sequences of transcripts I and II and the putative repeat sequences. Bases underlined in blue are the putative −35 and −10 boxes of transcripts I and II. The translational start codons of transcripts I and II are underlined in red, and the translational directions are indicated. The putative repeat sequences are indicated in colors.
Due to the lack of effective genetic tools (25), L. interrogans promoter activity has previously been tested in E. coli (36). The promoters described above were subcloned into the multicopy vector pKK232-8 (14) in order to generate promoter fusion plasmids pKP0194, pKP0195, and pKP0198 with the promoterless cat reporter gene in a positive orientation. The relative strengths of the promoters were measured by determining the CAT activities in E. coli. High levels of CAT enzyme activity were detected for the putative promoters of transcripts I and II, but only a very low level of CAT activity was detected for the putative promoter of transcript III (Table 4).
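The kind of search used to look for direct and inverted repeats can be sketched in a few lines of Python. This is a minimal exact-match scan over fixed-length words, not the algorithm of the online tool used in the study; the function names, the word length, and the example sequence are assumptions for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_repeats(seq, k=6, inverted=False):
    """List (first_start, second_start, word) for exact repeats of length k.

    For direct repeats the same word must recur downstream; for inverted
    repeats the reverse complement of the word must occur downstream.
    Only non-overlapping second copies are reported.
    """
    hits = []
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        partner = reverse_complement(word) if inverted else word
        j = seq.find(partner, i + k)
        while j != -1:
            hits.append((i, j, word))
            j = seq.find(partner, j + 1)
    return hits


# Hypothetical sequence: GGATCT, a 4-bp spacer, then its reverse complement AGATCC.
example = "TTGGATCTAAAAAGATCCTT"
print(find_repeats(example, k=6, inverted=True))
```

Inverted repeats are of particular interest in a promoter/operator region because dimeric helix-turn-helix regulators often recognize sites with dyad symmetry.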
E. coli CAT assay of the transcription activities of the three putative L. interrogans prophage promoters examined in this study
EMSA showed that the LA0195 protein could bind to the putative promoters of transcripts I and II but not to the putative promoter of transcript III. The amino acid sequence of protein LA0195 is similar to that of the prophage repressor CI protein, and it contains a conserved DNA-binding domain. It is known that the CI repressor can bind a specific DNA sequence upstream of and proximal to its own structural gene (7, 9, 34). Protein LA0195 was overexpressed and purified. By sodium dodecyl sulfate-polyacrylamide gel electrophoresis, we showed it to have a molecular mass of 21.6 kDa (data not shown).
Transcript I was transcribed head to head with the LA0195 gene. EMSA was performed with 10 fmol of the 5′-biotin-labeled probe 1 corresponding to the 323-bp putative promoter regulatory region of the transcript I by incubation with purified recombinant protein LA0195. As shown in Fig. 4A, a shifted band was observed when 0.02 μg of the LA0195 protein was added to the reaction mixture (Fig. 4A, lane 3), and the intensity of the shifted band became stronger along with the increase of LA0195 protein added up to 0.05 and 0.1 μg (Fig. 4A, lanes 4 and 5). This binding was effectively competed for by 50- and 100-fold excesses of unlabeled probe 1 (Fig. 4B, lanes 4 and 5), whereas the addition of 50- and 100-fold amounts of nonspecific competitor probe of poly(dI-dC) under the same experimental conditions could not prevent LA0195 protein binding (Fig. 4B, lanes 6 and 7). These two control experiments indicated that the binding is sequence specific.
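To put the protein amounts in the titration on the same scale as the 10 fmol of probe, the masses can be converted to molar quantities. The sketch below uses the 21.6-kDa molecular mass determined by gel electrophoresis and assumes that the protein is counted as monomers and that all of it is active; the function name is illustrative.

```python
def protein_fmol(mass_ug, molecular_mass_kda):
    """Convert a protein mass in micrograms to femtomoles."""
    grams = mass_ug * 1e-6
    grams_per_mol = molecular_mass_kda * 1e3
    return grams / grams_per_mol * 1e15


PROBE_FMOL = 10.0   # biotin-labeled probe per reaction, from the text
MASS_KDA = 21.6     # LA0195 molecular mass, from the text

for mass_ug in (0.01, 0.02, 0.05, 0.1, 0.2):
    fmol = protein_fmol(mass_ug, MASS_KDA)
    print(f"{mass_ug} ug = {fmol:.0f} fmol, about {fmol / PROBE_FMOL:.0f}x the probe")
```

Under these assumptions, the 0.02 μg at which the shifted band first appears for probe 1 corresponds to roughly 0.93 pmol of protein, that is, about a 93-fold molar excess over the probe.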
EMSAs for the binding of recombinant protein LA0195 to putative promoter probes representing the upstream regions of transcripts I and II. Each reaction (20 μl) containing 10 fmol of biotin-labeled promoter probe is presented in one lane of the gels. (A) Probe for transcript I with increasing concentrations of protein LA0195, with lanes 1 to 6 corresponding to 0, 0.01, 0.02, 0.05, 0.1, and 0.2 μg, respectively. (B) Probe for transcript I. Lane 1, biotin-labeled probe alone; lanes 2 and 3, biotin-labeled probe with protein LA0195 at 0.1 and 0.2 μg, respectively; lanes 4 and 5, biotin-labeled probe with 0.2 μg of protein LA0195 in the presence of 50- and 100-fold unlabeled probes, respectively; lanes 6 and 7, biotin-labeled probe with 0.2 μg of protein LA0195 in the presence of 50- and 100-fold nonspecific competitors, respectively. (C) Probe for transcript II with increasing concentrations of protein LA0195, with lanes 1 to 6 corresponding to 0, 0.01, 0.02, 0.05, 0.1, and 0.2 μg, respectively. (D) Probe for transcript II. Lanes 1 and 2, biotin-labeled probe with protein LA0195 at 0.1 and 0.2 μg, respectively; lanes 3 and 4, biotin-labeled probe with 0.2 μg of protein LA0195 in the presence of 50- and 100-fold unlabeled probes, respectively; lanes 5 and 6, biotin-labeled probe with 0.2 μg of protein LA0195 in the presence of 50- and 100-fold nonspecific competitor, respectively.
Transcript II contains the LA0195 gene only. As shown in Fig. 4C, a shifted band was observed when 0.05 μg of the LA0195 protein was added to the reaction mixture with probe 2, corresponding to the 279-bp putative promoter of transcript II (Fig. 4C, lane 4). Competition analyses with the unlabeled probe and the nonspecific competitor probe (Fig. 4D, lanes 3 to 6) also indicated that this binding was sequence specific.
EMSA was also performed with the purified recombinant protein LA0195 with probe 3, corresponding to the 371-bp putative promoter of transcript III. No band shift was observed under the same experimental conditions described above (data not shown).
DISCUSSION
Prophages, including defective ones, may play an important role in the evolution of the host bacteria by contributing extra genetic components encoding certain biological properties, including virulence. Defective prophages have frequently been discovered in complete bacterial genomic sequences (35, 39, 51, 52), and many of them harbored some functional genes (6, 38, 49, 51). In the present study, a prophage-like gene cluster in L. interrogans was detected by whole-genome expression profiling and further characterized based on the identification of both cotranscription patterns and a CI-like phage repressor protein.
Active expression of genes covering the region from genes LA0186 to LA0219 was previously observed in whole-genome expression profiling experiments comparing in vitro growth conditions at 28°C versus those at 37°C (42), but no concomitant up- or downregulation was consistently observed under these conditions. In contrast, a concomitant differential expression in virulent strain 56601 versus the avirulent strain IPAV of serovar Lai was observed here, although the level of variation was not quantitatively consistent. Because the DNA sequence of the studied region was identical between the two strains, we focused our analysis on the biological characteristics of this gene cluster rather than on the mechanism that might account for the variable differential expression.
Transcription analysis showed that the gene cluster was expressed in five transcripts, i.e., organized into five operons, designated transcripts I to V (Fig. 2). Reannotation of this gene cluster assigned 13 of the total 34 CDSs to previously defined proteins. Among them, five CDSs encode proteins apparently unrelated to prophages, such as the LA0202 gene, which encodes a pore-forming toxin protein that may permeabilize cell membranes by forming small pores (31). It has been reported that these toxins are the most common class of bacterial protein toxins and are important for bacterial pathogenesis (2, 5).
Among the eight phage-related genes, LA0195, the only CDS of transcript II, is the only one encoding a putative regulatory protein based on BLAST analysis. Further, Pfam and Motif analysis indicates that the LA0195 protein belongs to the HTH_3 family, which contains a helix-turn-helix domain ranging from the 20th to the 75th amino acid residues at its N terminus. This is a large family of DNA-binding proteins that contains 71 members, including a bacterial plasmid copy number control protein, bacterial methylases, various bacteriophage transcription control proteins, and a vegetative specific protein from Dictyostelium discoideum (http://pfam.sanger.ac.uk/family?acc=PF01381). The molecular function of this large family is sequence-specific DNA binding. Remarkably, with respect to the amino acid sequence, the LA0195 protein is similar to the prophage repressor-like CI protein, which is included in this family. For most prophages, the main regulatory switch protein, repressor CI, usually coordinates regulation with another protein, Cro. The cro gene is usually closely linked with cI and is transcribed head to head with the cI gene, sharing a 14-bp dyad symmetry operator site for CI protein binding (3, 9, 10, 30, 51). However, neither cro nor the 14-bp dyad symmetry operator was found in this gene cluster. Instead, the LA0195 gene is a singly transcribed gene, indicating that some substantial rearrangement might have taken place in this region since the integration of the prophage.
The putative promoter/operator sequences proximal to the LA0195 gene were further analyzed to test the probable regulatory function of the putative trans-acting factor encoded by the LA0195 gene via in vitro EMSA and in vivo CAT reporter activity assays in E. coli. The EMSA results showed that protein LA0195 binds to the strong promoters (tested in vivo) of transcripts I and II, although both of them lack the 14-bp dyad symmetry operator of CI protein binding sites (7, 9, 10, 32, 34, 51). On the other hand, protein LA0195 failed to bind to the putative promoter region of transcript III, which is consistent with the weak transcription initiation activity of this region tested in E. coli.
The predicted −35 and −10 consensus sequences of the putative promoters for transcripts I and II are both located in coding regions (Fig. 3). Therefore, the binding of RNA polymerase to the promoter of transcript I, located in the CDS of the LA0195 gene, might hinder the transcription of transcript II, whose promoter is located in the CDS of the LA0194 gene, the first gene of transcript I. One may thus speculate that the function of regulatory protein LA0195 is to control the transcription switch between transcripts I and II in response to environmental conditions.
Prophages are apparently acquired as easily as they are lost from the chromosome. Prophage genes without selective value to the host are therefore likely to be deleted (18). Bacterial phage particles have been observed in many different spirochetes (4, 8, 17, 23, 26, 28, 29, 41, 44, 45), and most of them can be induced by mitomycin C, ciprofloxacin, or N-methyl-N′-nitro-N-nitrosoguanidine (MNNG) treatment (54). Studies of some spirochete bacteriophages have focused on their function in the host, such as LE1 of Leptospira biflexa (11, 25); cp32s, a circular plasmid (23); phiBB-1, a bacteriophage of Borrelia burgdorferi (49); and VSH-1, a prophage-like gene transfer agent of Brachyspira hyodysenteriae (38). However, no extra DNA content has been found to be part of the virulent L. interrogans genome except for a previously reported genomic island in L. interrogans serovar Lai, which could be excised from a hypothetical prophage (12). Therefore, our report here confirming the gene cluster LA0186-LA0219 as a prophage remnant in the L. interrogans chromosome is the first of its kind.
It was interesting to learn that the differential expression of this prophage-like remnant is unrelated to its DNA sequence. This suggests that complex mechanisms, such as signal transduction or the presence of trans-acting coregulators for transcription initiation, may regulate the expression of this remnant in response to environmental changes. Further studies on the regulation of leptophage may also contribute to understanding the evolution of the pathogenic mechanism of L. interrogans.
ACKNOWLEDGMENTS
This study was supported in part by the National Natural Science Foundation of China (grants 30370071 and 30670102), the National High Technology Research and Development Program of China (grant 2006AA02Z176), and the Shanghai Leading Academic Discipline Project (T0206).
We are grateful to M. Picardeau (Unite de Bacteriologie Moleculaire et Medicale, Institut Pasteur, Paris, France) and Yu-Feng Yao (Department of Microbiology and Parasitology, Shanghai Jiao Tong University School of Medicine) for thoughtful comments on the manuscript.
FOOTNOTES
- Received 23 December 2007.
- Returned for modification 7 January 2008.
- Accepted 13 March 2008.
- Copyright © 2008 American Society for Microbiology