  • Identification and genetic dissection of convergent persister cell states

    Bacterial strains and culture conditions

    E. coli MG1655, UPEC CFT073, and derivative mutant strains (Supplementary Table 6) were routinely grown at 37 °C with shaking. One millilitre of culture was grown in 14-ml round-bottom culture tubes shaking at 300 rpm, or larger volumes were grown in flasks at lower shaking speed. For all liquid experiments, we used supplemented M9 medium (SM9)8 (1× M9 salts (DF0485-17, Fisher Scientific), 0.4% glucose, 2 mM MgSO4, 0.1 mM CaCl2, 2 μM ferric citrate, 3.1 g l−1 Neidhardt Supplement Mixture (NSM01, ForMedium) and micronutrient supplement as previously described61). Neidhardt Supplement was autoclaved for 20 min, stirred for 10 min, and then the other medium components were added.

    Semisolid medium was prepared as previously described62 but using SM9. To prepare semisolid SM9, Neidhardt Supplement Mixture was combined with SeaPrep agarose (3.5 g l−1, Lonza) and autoclaved for 22 min. Remaining medium components were added after autoclaving, as done for SM9 broth. Semisolid medium was then cooled to 37 °C. Cells were inoculated into semisolid medium at 37 °C and then briefly stirred. The resulting culture was placed in an ice bath for 30 min to let the medium gel. The ice bath must reach higher than the liquid level of the medium to evenly chill the entire volume. The semisolid culture was carefully transferred to 37 °C.

    For strain construction and plasmid preparation, E. coli were grown in LB Miller Broth (DF0446-07-5, Fisher Scientific). LB Miller plates were used for growth on solid medium unless otherwise noted. For plasmid maintenance, plates or broth were supplemented with 25 μg ml−1 chloramphenicol, 50 μg ml−1 kanamycin, or 50 μg ml−1 carbenicillin. Cells were routinely pelleted by centrifugation at 5,000g for 5 min.

    Plasmid construction

    prmf-RFP and pmdtK-RFP were assembled by NEB HiFi assembly (E2621L) using pBbA6C-RFP63 as backbone (amplified with SB226 and SB227) and prmf-GFP64 or pmdtK-GFP64 as insert (amplified with SB224 and SB228).

    All plasmids are listed in Supplementary Table 6. pBbS6C-dcas9, pBbS6C-metG, pBbS6C-metG*, and pBbS6C-yqgE were cloned by NEB HiFi assembly (E2621L) using a common vector backbone (pBbS6C-RFP63 amplified with SB167 and SB168) and the following gene inserts: pWJ445 amplified with SB170, SB171 (dCas9); MG1655 genomic DNA (gDNA) amplified with SB150, SB169 (metG); MG1655-metG* gDNA amplified with SB150, SB169 (metG*); MG1655 gDNA amplified with SB202, SB203 (yqgE).

    To assemble pYqgE+ and pRFP+, intermediate plasmid pBbE6A-RFP was assembled by ligation of ZraI- and XhoI-digested pBbS6C-RFP63 and pBbE2A-RFP63. pBbE6A-RFP was amplified by PCR with SB168 and SB180 to generate the vector fragment. For pYqgE+, yqgE was amplified from MG1655 genomic DNA with SB178 and SB179. For pRFP+, RFP was amplified from pBbE2A-RFP63 with SB197 and SB198. Vector and inserts were assembled using the HiFi assembly kit (E2621L, New England Biolabs).

    pBbS6A-yqgE (also called ‘i-pYqgE+’ for ‘inducible pYqgE+’) was assembled by NEB HiFi assembly (E2621L) using pBbS6A-yqgE63 as backbone (amplified with SB212 and SB213) and pYqgE+ for the insert (amplified with SB180 and SB214).

    Strain construction

    hipA7 cells were constructed by transferring the hipA7 mutation from TH1269 (ref. 31) to our MG1655 strain. All deletion strains were constructed using λ Red-mediated recombination65. The following primers were used to amplify the template from pKD4 for recombination: SB165 and SB166 (yqgE), SB183 and SB184 (lon), SB185 and SB186 (priA), SB191 and SB192 (sulA). For all strains except ΔpriA, the kanamycin resistance (kanr) cassette was removed with pCP20, which was subsequently lost after non-selective overnight growth at 42 °C. Strains were confirmed to have no remaining antibiotic resistance.

    Removal of kanr was not successful for MG1655-ΔpriA or metG*priA, as cultures did not grow at 42 °C. Instead, experiments in Extended Data Fig. 11a were done with the kanamycin marker still present. After outgrowths (Extended Data Fig. 11a), deletion of priA was confirmed again, and dnaC was checked for compensatory mutations66. ΔpriA strains were grown in M9 for cloning steps and then in SM9 for growth curves (Extended Data Fig. 11a).

    Bioreactor growth

    Ten litres of SM9 medium was prepared in a carboy. Around 160 ml were transferred into an autoclaved vessel (DASGIP) with 500 ml capacity. A silicone heater (GBH0250-1, BriskHeat) was used to bring the temperature of the medium to 37 °C, after which 100 μl of overnight culture was inoculated into the vessel, and medium flow into the vessel was turned on. An outflow pump maintained a constant level of medium. The culture was stirred with a stirrer bar at 500 rpm and air was flowed in at 0.1 l min−1. Medium flow rate was manually tuned to be above the E. coli doubling time. After >12 h, medium flow was turned off. Using a sampling port and syringe, samples were taken for OD600 measurement, antibiotic treatment, ScanLag, and/or PETRI-seq.

    Antibiotic survival assays

    To measure antibiotic tolerance, cells were incubated in SM9 containing 200 μg ml−1 ampicillin and/or 5 μg ml−1 ciprofloxacin for 4 h (unless otherwise noted) with shaking at 37 °C. For lag phase antibiotic survival, cells were taken either from the bioreactor or from an overnight culture and diluted 1:50 or 1:100 into SM9 plus antibiotics. When important, the length of the ‘overnight’ culture was noted (as in Fig. 5c or for 6-day stationary in Fig. 2b), but typical overnight cultures were grown for 16–24 h. For stationary phase antibiotic survival (‘undiluted’ in Extended Data Fig. 1a), antibiotics were added directly to the overnight culture. To count colonies after treatment, cells were pelleted, resuspended in PBS, and then plated on LB. CFUs were counted after 48 h and compared to CFUs before antibiotic treatment. Unless otherwise noted, replicates were biological replicates from distinct single colonies picked for overnight cultures.
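
    The survival measurement above reduces to a ratio of CFU counts after and before treatment, each corrected for plating dilution. The following is a minimal sketch of that arithmetic; the function names (`cfu_per_ml`, `survival_fraction`) and the colony counts are hypothetical illustrations, not values or code from this study.

```python
def cfu_per_ml(colonies, dilution_factor, plated_volume_ml):
    """Convert a plate count to CFU per ml of the undiluted sample."""
    if dilution_factor <= 0 or plated_volume_ml <= 0:
        raise ValueError("dilution factor and plated volume must be positive")
    return colonies * dilution_factor / plated_volume_ml


def survival_fraction(cfu_before, cfu_after):
    """Fraction of cells surviving treatment (CFU after / CFU before)."""
    if cfu_before <= 0:
        raise ValueError("CFU before treatment must be positive")
    return cfu_after / cfu_before


# Illustrative counts only (assumed, not from the study).
before = cfu_per_ml(colonies=120, dilution_factor=1e5, plated_volume_ml=0.1)
after = cfu_per_ml(colonies=60, dilution_factor=1e2, plated_volume_ml=0.1)
print(f"survival fraction = {survival_fraction(before, after):.1e}")
```

    Both counts need to refer to the same culture volume (for example, the diluted culture at the start and end of treatment) for the ratio to be meaningful.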

    To assay survival ‘in tet’ or ‘after tet’ (Fig. 3a,b), overnight cultures were diluted into fresh medium containing 54 μg ml−1 tetracycline. Cultures were incubated in tetracycline for 30 min at 37 °C with shaking. Then, antibiotics were added (in tet), or cells were pelleted, washed twice in tetracycline-free fresh medium, then treated with antibiotics (after tet). Cells were kept in antibiotics for 4 h.

    To assay antibiotic survival after rifampicin (Extended Data Fig. 8b), overnight cultures were diluted into fresh medium containing 200 μg ml−1 rifampicin. Cultures were incubated for 30 min at 37 °C with shaking, then ampicillin or ciprofloxacin was added and incubated for 4 h (‘WT in rifampicin’). For comparison, an overnight culture was diluted into antibiotic-containing medium (without rifampicin, ‘WT’ on plot). To assay survival in rifampicin alone (Extended Data Fig. 8c), overnight cultures were diluted into fresh medium containing 200 μg ml−1 rifampicin and incubated for 1 h at 37 °C with shaking. CFUs were counted before and after rifampicin.

    Appearance time assays

    Cells were taken either from the bioreactor or from a standard overnight culture and ~100 CFU were spread on an LB or SM9 agar plate. Unless otherwise noted, replicates were measured from distinct single colonies picked for inoculation. To maximize reproducibility, all plates contained 25 ml of medium. Colony appearance times were not different between LB and SM9 plates. As detailed previously19, plates were put on a scanner (Epson V500 Photo) in the 37 °C incubator and scanned at 15-min intervals for 24–48 h.

    Scanners were controlled by ScanningManager software, and images were analysed using Matlab scripts previously published19. Appearance times were found using the appearance output of getAppearanceGrowth. Minimum colony size was set to 20 and maximum set to 100.

    PETRI-seq library preparation

    Growth conditions for all PETRI-seq samples are detailed in Supplementary Table 3.

    PETRI-seq of E. coli cells was carried out as detailed previously2. A stepwise protocol is available at: https://tavazoielab.c2b2.columbia.edu/PETRI-seq/. In brief, cells were pelleted and fixed overnight in 4% formaldehyde. The following day, cells were washed twice in PBS with RNase inhibitor (PBS-RI) and then resuspended in 50% ethanol in PBS-RI. In 50% ethanol, cells could be stored at −20 °C for at least 2 weeks. Cells were washed twice in PBS-RI to remove the ethanol and then permeabilized with lysozyme. Cells were washed twice again and then treated with DNase. After DNase inactivation, cells were washed twice in PBS-RI. As a stopping point, cells could then be resuspended in 50% ethanol in PBS-RI and saved at −20 °C for at least 2 weeks; then they were washed twice again in PBS-RI before resuming. To continue cell preparation, the cell pellet was resuspended in PBS-RI and counted using a haemocytometer. Split-pool barcoding, cell lysis, and second strand synthesis were performed as described, yielding 20 μl purified cDNA2. For tagmentation, EZ-Tn5 (TNP92110, Biosearch Technologies) was loaded by annealing SB117 and SB118 (Supplementary Table 6), diluting the oligonucleotides to 5 μM each in 50% glycerol, and then adding 2 μl EZ-Tn5 to 8 μl of the oligonucleotides. EZ-Tn5 was incubated with the oligonucleotides for 30 min at room temperature; loaded EZ-Tn5 was stored at −20 °C. 0.125 μl of loaded EZ-Tn5, 24.875 μl TD buffer (FC-131–1096, Illumina), and 5 μl water were added to 20 μl purified cDNA and incubated at 55 °C for 5 min then brought to 10 °C. 12.5 μl NT (FC-131–1096, Illumina) was immediately added to stop the reaction. 
Tagmented cDNA was amplified in a 500 μl PCR with Q5 polymerase (M0491L, New England Biolabs): 100 μl 5× buffer, 10 μl 10 mM dNTPs (N0447L, New England Biolabs), 5 μl Q5 polymerase, 85 μl Q5 High GC Enhancer, 0.5 μM N70x (Nextera Index Kit v2 Set A, TG-131-2001, Illumina; or equivalent from Integrated DNA Technologies), 0.5 μM i50x (E7600S, New England Biolabs; or equivalent from Integrated DNA Technologies). Libraries were amplified until the early exponential phase (~16–18 cycles): 72 °C 3 min; 95 °C 30 s; cycle: 95 °C 10 s, 55 °C 30 s, 72 °C 30 s; 72 °C 5 min. PCR reactions were pooled (if the 500 μl reaction had been split into multiple PCR tubes), and 100 μl was taken, purified twice with AMPure XP beads (A63881, Beckman Coulter), and eluted in 30 μl water. The resulting libraries could be sequenced directly (non-depleted) or rRNA-depleted using Cas9.

    rRNA depletion of PETRI-seq libraries by Cas9

    PETRI-seq libraries were subjected to rRNA depletion by the canonical Cas9::crRNA::tracrRNA tripartite complex67. To prepare tracrRNA, a dsDNA template (C2425) was made by PCR of pWJ4023 with Q5 polymerase and primers W2031 and W2032. Alternatively, C2425, which is 96 bases long, could be made by ordering and annealing complementary oligonucleotides. C2425 was used for T7 in vitro transcription with the TranscriptAid T7 High Yield Transcription Kit (K0441, Thermo Scientific) by combining the following in a 20 μl reaction: 4 μl 5× reaction buffer, 2 μl 100 mM ATP, 2 μl 100 mM CTP, 2 μl 100 mM GTP, 2 μl 100 mM UTP, 1 μl T7 RNAP enzyme, 700 ng C2425. The reaction was incubated at 37 °C for 4 h, during which a white precipitate became visible. 1 μl DNase I (AMPD1, Millipore Sigma) was added and incubated at 37 °C for an additional 45 min to digest the DNA template. RNA was purified using the Norgen Biotek Total RNA purification kit (37500, Norgen) to generate J703 (tracrRNA). To prepare crRNAs, 45 μM W2034 (T7 promoter) and 45 μM W2035-W2141 (separate reaction for each) were combined in annealing buffer (10 mM Tris pH 7.5, 50 mM NaCl, 1 mM EDTA), heated to 95 °C for 5 min then cooled to room temperature. 1 μl of annealed product was used for T7 in vitro transcription (K0441, Thermo Scientific) by adding the following: 4 μl 5× reaction buffer, 2 μl 100 mM ATP, 2 μl 100 mM CTP, 2 μl 100 mM GTP, 2 μl 100 mM UTP, 1 μl T7 RNAP enzyme, 6 μl water. The reaction was incubated at 37 °C for 4 h, during which a white precipitate became visible. One microlitre DNase I (AMPD1, Millipore Sigma) was added and incubated at 37 °C for an additional 45 min to digest the DNA template. Each resulting crRNA was purified using the Norgen Biotek Total RNA purification kit (37500, Norgen).
To anneal tracrRNA to crRNA, 70 pmol tracrRNA (J703) and 70 pmol crRNA were combined in 10 μl of annealing buffer (10 mM Tris pH 7.5, 50 mM NaCl, 1 mM EDTA), heated to 95 °C for 5 min, then slowly cooled to room temperature to yield 7 pmol μl−1 tracrRNA::crRNA. All 59 annealed tracrRNA::crRNA were pooled in an equimolar ratio. rRNA was depleted by combining the following in a 50 μl reaction: 5 μl 10× reaction buffer (Z03386, GenScript), 0.74 μl tracrRNA::crRNA (5.18 pmol total tracrRNA::crRNA; 0.088 pmol of each), 10 μl Cas9 (Z03386, GenScript), 49–80 ng PETRI-seq library. The reaction was incubated at 37 °C for 90 min then purified twice with 1× AMPure beads. The library concentration was measured using the Agilent Bioanalyzer High Sensitivity DNA Kit (5067-4626, Agilent). Libraries were sequenced for 75 cycles (58 R1, 17 R2) using the NextSeq 500/550 High Output Kit v2.5 (20024906, Illumina). rRNA-depleted libraries were loaded at ~2× the recommended concentration to account for cleaved rRNA fragments without both Illumina adapters. Non-depleted libraries were loaded at ~1.5× the recommended concentration.
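
    The quantities quoted for the pooled tracrRNA::crRNA can be checked with a few lines of arithmetic. This sketch uses only the numbers stated above; the variable names are ours, and it assumes that equimolar pooling of equal-concentration duplexes leaves the total duplex concentration unchanged at 7 pmol μl−1.

```python
# Values taken from the text; variable names are illustrative.
DUPLEX_PMOL = 70.0        # pmol tracrRNA (and crRNA) per annealing reaction
ANNEAL_VOLUME_UL = 10.0   # annealing reaction volume (microlitres)
N_GUIDES = 59             # number of annealed tracrRNA::crRNA duplexes pooled
POOL_VOLUME_UL = 0.74     # pool volume added to the 50 microlitre Cas9 reaction

# Concentration of each annealed duplex, and of the equimolar pool.
duplex_conc = DUPLEX_PMOL / ANNEAL_VOLUME_UL   # pmol per microlitre

# Total duplex added to the depletion reaction, and the share per guide.
total_pmol = duplex_conc * POOL_VOLUME_UL
per_guide_pmol = total_pmol / N_GUIDES

print(f"duplex concentration: {duplex_conc:.1f} pmol/ul")
print(f"total duplex in reaction: {total_pmol:.2f} pmol")
print(f"per guide: {per_guide_pmol:.3f} pmol")
```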

    Our rRNA depletion strategy is in theory very similar to DASH68, which amplifies cDNA after Cas9 cleavage. Further optimization could include testing the differences between these techniques.

    Fluorescence-activated cell sorting

    metG* cells were transformed with fluorescent transcriptional reporters for rmf, cysK and mdtK promoters64 (Supplementary Table 6). Overnight cultures were diluted 1:100 (rmf, cysK, dual markers) or 1:50 (mdtK) into SM9 then grown for 3.5 (rmf), 3.17 (cysK), 2.5 (mdtK), 5.8 (dual cysK/rmf), or 5.25 (dual cysK/mdtK) hours at which point they reached OD600 of 0.401 (rmf), 0.336 (cysK), 0.238 (mdtK), 0.35 (dual cysK/rmf), or 0.59 (dual cysK/mdtK). Cells were centrifuged at 5,000g for 5 min and then resuspended in PBS. Using an S3e Cell Sorter (12007058, Bio-Rad), cells were analysed, gated by forward scatter versus side scatter (Bio-Rad ProSort; Extended Data Fig. 3h,i), then sorted by GFP expression (high or low) into PBS. Sorted cells were counted (CFU), inoculated into antibiotic-containing SM9, and/or used for ScanLag. For the protein expression assay shown in Extended Data Fig. 3f,g, metG*pcysK-GFP cells were transformed with pBbA6C-RFP63 (35290, Addgene), which expresses RFP under the LlacO1 promoter. Overnight cultures were diluted 1:50 into SM9 + 500 µM IPTG then grown for 4.6 h (OD600 = 0.182). We noted that because of the stochasticity in metG* lag times, time to reach a particular OD600 after dilution from an overnight varied substantially by experiment. Cells were resuspended in PBS, analysed, gated by forward scatter versus side scatter (Extended Data Fig. 3h,i), then sorted by GFP and RFP expression into SM9 (Extended Data Fig. 3f). GFP- or RFP-only controls were used to compensate for overlapping emissions of GFP and RFP. Because the cells come out of the sorter in PBS, the final composition of medium was 71% SM9 in PBS. Cell density was too low to successfully pellet the cells and change medium. Cells were grown at 37 °C with shaking (300 rpm) and analysed at given timepoints over the next day. 
OD600 stayed below 0.01 for the duration of the experiment, likely reflecting high purity of cells with long lag times and possibly reduced growth rate from 29% PBS. To see RFP expression in persister cells (Extended Data Fig. 3g), populations were gated on high GFP (cysK+). To make Extended Data Fig. 3, FlowJo 10.8.1 was used. Distributions were plotted using the layout editor. metG* cells without a fluorescent protein expression vector were used to subtract background autofluorescence (for Extended Data Fig. 3g).

    Generating crRNA library with CALM

    E. coli crRNA libraries were generated using CALM, as previously described3 with one minor modification. C2185 (insert library) and C2184 (backbone) were assembled by Gibson reaction (E2621L, New England Biolabs) and transformed into MG1655 cells without pWJ445 (dCas9 plasmid). The library was grown in LB broth containing 50 μg ml−1 kanamycin at 37 °C until OD600 reached ~0.4 (about 4 h). The resulting library was pelleted and used to miniprep an assembled crRNA plasmid library, labelled M60. Sequencing of this library confirmed high coverage of the genome with each gene targeted on average by 56 unique crRNAs.

    CRISPRi screen sample collection

    Supplementary Table 5 includes details about each CRISPRi library. Generally, electrocompetent cells were prepared from the parental strain containing either pWJ445 (pTet-dCas9) or pBbS6C-dCas9 (pLlacO1-dCas9). Different inducers (IPTG or aTc) were used for replicates to avoid inducer-specific effects. ~200 ng of M60 (crRNA plasmid library) was electroporated with 50 μl of cells using the MicroPulser (Bio-Rad) set to the default E. coli program 1 (1 mm, 1.8 kV, 6.1 ms). Cells were recovered in 500 μl SOC medium for 1.5 h at 37 °C. A small volume was taken to count colonies on LB agar with or without selection antibiotics (kanamycin + chloramphenicol) in order to calculate transformation efficiency and ensure adequate library coverage. The crRNA library contains at most 500,000 crRNAs3, and transformations routinely yielded >100 million transformants. The remaining volume of recovered cells was transferred to a flask containing 225 ml SM9 with kanamycin + chloramphenicol. For YqgE/RFP overexpression screens (SBC308-SBC315; Supplementary Table 5), 500 μM IPTG was added 2 h later to induce yqgE or RFP. For all libraries, cells were grown at 37 °C until they reached an OD600 of ~0.4 (3–4 h). The resulting library (L) was divided for either late exponential/stationary dCas9 induction before lag phase assays or exponential dCas9 induction before exponential assays. Samples were also taken from L for SBC95 and SBC125 (Supplementary Table 5). Late exponential/stationary induction allowed for minimal loss of essential crRNAs (area under the curve = 0.48–0.52 for wild-type/metG* before and after induction), so all genes could be assayed in lag phase.
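
    Library coverage follows directly from the transformant count and the library size. The sketch below is one way to do this bookkeeping; the function names and the plate count in the example are hypothetical, and only the library size (at most 500,000 crRNAs) and the transformant yield (>100 million) come from the text.

```python
def total_transformants(colonies, dilution_factor, plated_volume_ml, recovery_volume_ml):
    """Estimate transformants in the whole recovery culture from one selective plate."""
    if plated_volume_ml <= 0:
        raise ValueError("plated volume must be positive")
    return colonies * dilution_factor / plated_volume_ml * recovery_volume_ml


def fold_coverage(n_transformants, library_size):
    """Average number of independent transformants per library member."""
    if library_size <= 0:
        raise ValueError("library size must be positive")
    return n_transformants / library_size


# Numbers from the text: at most 500,000 crRNAs and >100 million transformants,
# which corresponds to at least 200-fold coverage.
print(fold_coverage(100_000_000, 500_000))

# Hypothetical plate count, for illustration only.
estimate = total_transformants(colonies=150, dilution_factor=1e5,
                               plated_volume_ml=0.1, recovery_volume_ml=0.5)
print(f"estimated transformants: {estimate:.2e}")
```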

    Lag phase assays

    For lag phase assays, 20 ml of cell library (L) was pelleted and resuspended in 20 ml SM9 containing kanamycin, chloramphenicol, and either anhydrotetracycline (aTc; 20 nM) or isopropyl-β-d-thiogalactopyranoside (IPTG; 500 μM). Centrifugation likely was not necessary here but was done every time for consistency. The culture was grown overnight (14–22 h) to induce dCas9 and reach stationary phase. The next day, cells were pelleted and resuspended in the same volume of SM9 without inducer or antibiotics. Samples were taken for SBC96, SBC126, SBC191, SBC205, SBC316, SBC308, SBC310, SBC312 and SBC314 (Supplementary Table 5).

    For lag outgrowth, resuspended cells were either inoculated into 500 ml semisolid SM9 (~80 million cells per litre for SBC101, SBC131, SBC210, SBC318; ~2 × 10⁹ cells per litre for SBC100, SBC130) or diluted 100× into SM9 broth (SBC102, SBC132, SBC192, SBC309, SBC311, SBC313, SBC315). Outgrowth times are shown in Supplementary Table 5.

    For lag antibiotic treatment, resuspended cells were diluted 100× into SM9 containing 200 μg ml−1 ampicillin and 5 μg ml−1 ciprofloxacin and incubated for the indicated amount of time (Supplementary Table 5). After antibiotic treatment, cells were pelleted, washed in PBS, then resuspended in SM9 and inoculated into 500 ml semisolid SM9 medium at a density of ~50 × 10⁶ cells per litre. After 2 days, cell samples were collected for SBC98, SBC99, SBC128, SBC129, SBC207, SBC208, SBC209 and SBC317. Semisolid medium was used to minimize interclone competition.

    Exponential assays

    For exponential assays, dCas9 was induced in exponential phase, and cells were not grown overnight to stationary phase. The cell library (L) was diluted 200× into SM9 containing kanamycin, chloramphenicol, and 20 nM aTc or 500 μM IPTG. Cells were grown for 3.5–4.5 h (OD600 = ~0.2–0.4). Samples were taken for SBC133 and SBC211 (Supplementary Table 5) and also diluted 100× into SM9 containing kanamycin, chloramphenicol, and 20 nM aTc or 500 μM IPTG. These cells were grown for 3–4 h then sampled for SBC134 and SBC212.

    CRISPRi library preparation

    Collected cell samples (described above) were pelleted and miniprepped (Qiagen). 400 ng of DNA were amplified in a 60 μl PCR with Q5 polymerase (M0491L, New England Biolabs), 0.5 μM of forward primer (equimolar mixture of W1397, W1398, W1399, W1400), and 0.5 μM of reverse primer (W1699). The reaction was thermocycled as follows: 98 °C 30 s; 10× 98 °C 10 s, 55 °C 20 s, 72 °C 30 s; 72 °C 2 min. PCR products were purified by double-sided AMPure cleanup (left-side ratio = 0.8×; right-side ratio = 1.4×) then eluted in 40 μl H2O. 2.5 μl of purified DNA was used for a second PCR in 100 μl using Q5 polymerase, 0.5 μM forward primer (CRISPRi_PCR_2_F; Supplementary Table 6), and 0.5 μM reverse primer (CRISPRi_PCR_2_R; Supplementary Table 6). The reaction was thermocycled as follows: 98 °C 30 s; 6× 98 °C 10 s, 55 °C 20 s, 72 °C 30 s; 72 °C 2 min. PCR products were purified by two AMPure cleanups (first 0.9×, then 0.8×) and eluted in 30 μl. The library concentration was measured using the Agilent Bioanalyzer High Sensitivity DNA Kit (5067-4626, Agilent). Libraries were sequenced for 75 cycles using the NextSeq 500/550 High Output Kit v2.5 (20024906, Illumina). Single-end reads are ideal because only Read 1 is useful for mapping crRNAs. However, depending on the forward primer used for PCR 2, as few as 58 cycles can be allocated to read 1.

    Antibiotic susceptibility with bortezomib

    Bortezomib (5043140001, Millipore Sigma) stock was prepared by dissolving in DMSO. For the assays in Extended Data Fig. 11d,e, single colonies were picked into SM9 containing 1% DMSO and 100 μM bortezomib. Control (– bzmb) cultures were started in the same way but SM9 contained 1% DMSO and no bortezomib. After overnight culture, antibiotic survival was assayed as described in ‘Antibiotic survival assays’, but for lag phase assays, antibiotic-containing SM9 was supplemented with 1% DMSO ± 100 μM bortezomib.

    Lag times after bortezomib treatment

    For full growth with bortezomib (dotted black line in Fig. 5b), single colonies of metG*sulA cells were picked into 1 ml SM9 containing 1% DMSO and 100 μM bortezomib (5043140001, Millipore Sigma). For bortezomib addition during stationary phase (dotted pink line in Fig. 5b), single colonies of metG*sulA cells were picked into 1 ml SM9. After 20 h, bortezomib was added to 100 μM. For bortezomib treatment during lag phase (dotted green line in Fig. 5b) or not at all (control; grey line in Fig. 5b), single colonies of metG*sulA cells were picked into 1 ml SM9 containing 1% DMSO. After 24 h of growth, all cultures were diluted 100× into SM9 containing either 100 μM bortezomib and DMSO (green line in Fig. 5b) or only 1% DMSO (all other samples). Cells were grown on a plate reader (37 °C with continuous shaking; PowerWave XS2, BioTek) and OD600 measured at 10-min intervals.

    Stationary phase translation assay

    E. coli cells of the indicated genotype (Fig. 5f) containing pBbS6C-RFP were grown for 24 h in 1 ml SM9. Two-hundred microlitres of overnight culture were transferred to a 96-well plate and supplemented with 2.5 mM IPTG. Cells were grown on a plate reader (37 °C with continuous shaking; Synergy Neo2, BioTek). RFP (570 excitation, 620 emission) and OD600 were measured at 10-min intervals.

    metG complementation of metG* mutation

    Wild-type or metG* cells carrying either pBbS6C-metG or pBbS6C-metG* (Supplementary Table 6) were grown overnight in 1 ml SM9 containing chloramphenicol. Overnight cultures were diluted 1,000× into SM9 containing chloramphenicol and 500 μM IPTG. These cultures were grown overnight again. The following day, lag phase antibiotic survival was assayed with ampicillin and ciprofloxacin (Extended Data Fig. 9d).

    Quantitative proteomics

    Overnight cultures of 3 colonies each of MG1655, metG*, and metG*-Δlon-ΔsulA were grown in 1 ml SM9 for ~19 h at 37 °C. Stationary samples were taken directly from the overnight cultures. For wild-type (MG1655) exponential cells (Extended Data Fig. 7), MG1655 overnight cultures were diluted 200× into fresh SM9 and grown for 90 min (final OD600 = 0.2–0.23). For metG* lag/persister cells (Extended Data Fig. 7), metG* overnight cultures were diluted 100× into fresh SM9 and grown for 30 min. OD600 did not increase in that 30 min (replicate 1: ODinitial = 0.069, OD30min = 0.065; replicate 2: ODinitial = 0.07, OD30min = 0.065; replicate 3: ODinitial = 0.072, OD30min = 0.066).

    Cells were collected in Eppendorf tubes and washed twice with ice-cold PBS. Cells were then lysed in lysis buffer containing 8 M urea, 0.1 M ammonium bicarbonate, and protease inhibitors (1 mini-Complete EDTA-free tablet). The lysate was cleared by centrifugation at 14,000 rpm for 30 min at 4 °C. The supernatant was transferred to a new tube, and the protein concentration was determined using a BCA assay (Pierce). Subsequently, 10 µg of total protein was subjected to disulfide bond reduction with 10 mM DTT (at 56 °C for 30 min) followed by alkylation with 10 mM iodoacetamide (at room temperature for 30 min in the dark). Excess iodoacetamide was quenched with 5 mM DTT (at room temperature for 15 min in the dark). Samples were then diluted sixfold with 50 mM ammonium bicarbonate and digested overnight at 37 °C with a trypsin/Lys-C mix (1:100). The next day, digestion was stopped by the addition of 1% TFA (final v/v), followed by centrifugation at 14,000g for 10 min at room temperature to pellet precipitated lipids. Cleared digested peptides were desalted on an SDB-RPS Stage-Tip disk69 and dried down in a speed-vac. Peptides were resuspended in 10 µL of 3% acetonitrile/0.1% formic acid and injected onto a Thermo Scientific Orbitrap Fusion Tribrid mass spectrometer using a DIA method for peptide MS/MS analysis.

    The UltiMate 3000 UHPLC system coupled with an EASY-Spray PepMap RSLC C18 column was used to separate fractionated peptides with a gradient of 5–30% acetonitrile in 0.1% formic acid over 90 min at a flow rate of 300 nl min−1. After each gradient, the column was washed with 90% buffer B for 10 min and re-equilibrated with 98% buffer A (0.1% formic acid, 100% HPLC-grade water) for 30 min. Survey scans of peptide precursors were performed from 350–1,200 m/z at 120 K FWHM resolution with a 1 × 10⁶ ion count target and a maximum injection time of 60 ms. The instrument was set to run in top speed mode with 3-s cycles for the survey and MS/MS scans. After a survey scan, 26 m/z DIA segments were acquired from 200–2,000 m/z at 60 K FWHM resolution with a 1 × 10⁶ ion count target and a maximum injection time of 118 ms. HCD fragmentation was applied with 27% collision energy, and resulting fragments were detected using the rapid scan rate in the Orbitrap. The spectra were recorded in profile mode.

    DIA data were analysed with the MaxDIA software platform within the MaxQuant software environment using a library-free approach70. The search was set up with the reference E. coli proteome database downloaded from UniProt. The false discovery rate (FDR) was set to 1% at the peptide precursor level and 1% at the protein level. Results obtained from MaxQuant were further analysed using the standard pipeline for differential analysis with the DEP package71. Proteins were filtered for inclusion in 2 out of 3 replicates of at least one condition. Data was normalized by variance stabilizing transformation. Missing data was imputed using the MinProb method with q = 0.01. Significantly enriched proteins were defined by alpha = 0.05 and lfc = log2(1.5) (Supplementary Table 2). For the principal components analysis (PCA) in Extended Data Fig. 7c, LFQ intensity (for included samples) was log-transformed and scaled with StandardScaler to centre each protein with mean of 0 and s.d. of 1. Principal components were calculated from all proteins using sklearn72. See next section for PCA and UMAP in Extended Data Fig. 7a,b.
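
    The replicate filter and the significance thresholds described above can be illustrated without the DEP package. This is a sketch under stated assumptions, not a re-implementation of DEP: the function names are hypothetical, missing values are represented as `None`, and the handling of values exactly at the cut-offs (`<=` for alpha, `>=` for the fold change) is our assumption.

```python
import math

ALPHA = 0.05                  # adjusted P value cut-off from the text
LFC_CUTOFF = math.log2(1.5)   # log2 fold-change cut-off from the text


def passes_filter(intensities_by_condition, min_valid=2):
    """True if at least one condition has `min_valid` or more quantified replicates."""
    return any(
        sum(value is not None for value in replicates) >= min_valid
        for replicates in intensities_by_condition.values()
    )


def is_significant(adj_p, log2_fc, alpha=ALPHA, lfc=LFC_CUTOFF):
    """Apply both thresholds: adjusted P value and absolute log2 fold change."""
    return adj_p <= alpha and abs(log2_fc) >= lfc


# Hypothetical protein quantified in 2 of 3 wild-type replicates only.
example = {"wild-type": [21.3, None, 20.9], "metG*": [None, None, 19.8]}
print(passes_filter(example))
print(is_significant(adj_p=0.01, log2_fc=1.0))
```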

    PETRI-seq analysis

    Barcode demultiplexing was carried out as previously described2 with the following minor modification73: before extracting the unique molecular identifier (UMI) sequence, PEAR74 was used to merge reads 1 and 2 when they overlapped. Only non-overlapping reads were carried forward because read 2 should contain cDNA sequence, and the end of read 1 should contain barcode 1. Note that this may not apply when sequencing more than 75 cycles. Also, read 2 was trimmed if it matched the reverse complement of the end of read 1, an artefact we think occurs due to hairpin formation. The full pipeline uses trimmomatic75 (v0.33) to filter reads, Cutadapt76 (v1.18) to demultiplex, UMI-tools77 (v0.5.5) to extract UMIs, bwa78 (v0.7.17) to align, and featureCounts79 (v1.6.3) to annotate features.

    Seurat (version 4.1.1)80 was used for normalization, dimensionality reduction, and clustering of PETRI-seq data. In brief, the matrices produced by demultiplexing and UMI collapsing were read into a Seurat object. All MG1655 samples in this study (Supplementary Table 3) were combined in the same Seurat object. For Extended Data Fig. 6g–j, a new Seurat object was made with all MG1655 cells plus CFT073 cells; accessory genes only in the CFT073 genome were omitted. For all analysis, rRNA counts were excluded except for Extended Data Fig. 8. Barcodes were filtered for more than 9 and fewer than 1,000 mRNA UMIs. All cells were then downsampled to 38 UMIs using the SampleUMI function (max.umi = 38, upsample = FALSE). UMI counts were log-normalized using the geometric mean of all cell UMI counts as a scale factor. Gene counts were scaled and centred to a mean of 0 and s.d. of 1 (Seurat ScaleData). Principal components were calculated with all genes. For the full cell atlas (Fig. 1f), principal components 1–10 were used to compute UMAP81 coordinates (default parameters) and to find neighbouring cells. Clusters were found using default parameters82 (Louvain algorithm) at resolution 0.32. For hipA7 cells alone (Extended Data Fig. 5a), principal components 1–5 were used to find neighbouring cells, and clusters were found at resolution 0.1. For extended stationary (6-day) wild-type cells with metG* and standard wild-type cells (Extended Data Fig. 5f), principal components 1–6 were used to find neighbouring cells, and clusters were found at resolution 0.16. For the full atlas downsampled to ~30 mRNA UMIs (Extended Data Fig. 5i), cells were downsampled as described with max.umi = 30. Then, only cells with exactly 29 or 30 mRNA UMIs were kept in the Seurat object. Cells were processed and clustered as with the full atlas (10 principal components, resolution = 0.34). For CFT073 cells alone (Extended Data Fig. 6d,e), CFT073 accessory genes were included, principal components 1–10 were used to find neighbouring cells, and clusters were found at resolution 0.31. For CFT073 with MG1655 cells (Extended Data Fig. 6g), principal components 1–10 were used to find neighbouring cells, and clusters were found at resolution 0.38.
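
    The filtering, downsampling and log-normalization steps can be sketched in plain Python to make the order of operations concrete. This is an illustration only: the analysis itself used Seurat, whose SampleUMI function may use a different sampling scheme than the draw without replacement shown here, and the function names, toy counts, and the choice to compute the geometric mean from the cells passed in are all assumptions.

```python
import math
import random


def filter_cells(cells, min_exclusive=9, max_exclusive=1000):
    """Keep barcodes with more than 9 and fewer than 1,000 mRNA UMIs."""
    return {bc: counts for bc, counts in cells.items()
            if min_exclusive < sum(counts.values()) < max_exclusive}


def downsample(counts, max_umi=38, rng=random):
    """Subsample a cell's UMIs without replacement to at most `max_umi` (no upsampling)."""
    if sum(counts.values()) <= max_umi:
        return dict(counts)
    pool = [gene for gene, n in counts.items() for _ in range(n)]
    sampled = {}
    for gene in rng.sample(pool, max_umi):
        sampled[gene] = sampled.get(gene, 0) + 1
    return sampled


def geometric_mean(values):
    """Geometric mean of positive numbers."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def log_normalize(cells):
    """log1p(count / cell total * scale), with scale = geometric mean of cell totals."""
    totals = {bc: sum(counts.values()) for bc, counts in cells.items()}
    scale = geometric_mean(list(totals.values()))
    return {bc: {gene: math.log1p(n / totals[bc] * scale) for gene, n in counts.items()}
            for bc, counts in cells.items()}


# Toy gene-by-cell counts (hypothetical), keyed by barcode.
toy = {"bc1": {"rmf": 30, "cysK": 20}, "bc2": {"rmf": 5}, "bc3": {"rmf": 12, "mdtK": 3}}
rng = random.Random(0)
kept = filter_cells(toy)
sampled = {bc: downsample(counts, max_umi=38, rng=rng) for bc, counts in kept.items()}
normalized = log_normalize(sampled)
print(sorted(sampled), [sum(c.values()) for c in sampled.values()])
```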

    To project proteomics samples with scRNA-seq (Extended Data Fig. 7a,b), proteomics samples were log-normalized using the geometric mean of the scRNA-seq library. Proteomics samples were then merged into a single Seurat object with downsampled, log-normalized scRNA-seq data. This entire Seurat object was scaled and centred with ScaleData. For the PCA (Extended Data Fig. 7a), loadings were extracted from the scRNA-seq Seurat object and used to project all cells and proteomic samples. For UMAP (Extended Data Fig. 7b) and clustering (Extended Data Fig. 7a,b), principal components 1–6 were used with n.neighbors = 50 and k.param = 50. Clusters were found at resolution 0.32. If principal components 1–10 and default n.neighbors and k.param are used (as with scRNA-seq alone), then the stationary and lag proteomes form their own cluster; exponential proteomes still cluster with early exponential transcriptomes.

    When the full cell atlas is shown or analysed (Fig. 1 and Extended Data Figs. 1n–u and 5i), only cell samples relevant up to that point in the text are shown or included in expression analysis, but all cells (as listed in Supplementary Table 3) were used to compute principal components, UMAP coordinates, and cell clusters. See Supplementary Table 3 for details of which cell samples are included in each figure.

    For Extended Data Fig. 8, which defines transcriptional deficiency, different thresholds were used to retain cells with very low mRNA counts. Specifically, in Extended Data Fig. 8f,g, all cells with total RNA above a library-specific threshold (between 16 and 64 total UMIs) were retained. By contrast, Fig. 2f includes only cells with at least 10 mRNAs, as these are the cells used for UMAP and clustering. rRNA depletion is also important to consider when defining transcriptional deficiency (Extended Data Fig. 8d–f). Extended Data Fig. 8d,e shows only libraries that were not subjected to rRNA depletion. In Extended Data Fig. 8f, all libraries are included, with a slightly different threshold used for depleted and non-depleted libraries.

    Differential expression analysis from scRNA-seq

    To find genes differentially expressed between cell clusters or pre-defined populations, a custom pipeline combining edgeR83 and Seurat’s FindMarkers tool was used. EdgeR was used with TMM normalization to calculate log2(fold change) from pseudobulk samples. Pseudobulk samples are calculated by summing all counts from all single cells of a given population; single-cell transcriptomes are taken before downsampling. For P values, limma’s84 rankSumTestWithCorrelation (the default for Seurat’s FindMarkers; two-sided Wilcoxon–Mann–Whitney) was used with downsampled, log-transformed single-cell data as input. Using downsampled cells for significance testing gives the result most consistent with the centred edgeR data. Total UMI counts by sample (before and after downsampling) are provided in Supplementary Table 4.

    CRISPRi analysis

    CRISPRi sequencing reads were aligned to reference genomes for E. coli then to S. aureus (used for library manufacturing3). Functional spacers were identified as described3 based on presence of an “NGG” PAM sequence. Only functional E. coli spacers were used for downstream analysis.

    For lag and exponential comparisons, spacer abundance post-outgrowth was compared to pre-outgrowth. For lag antibiotic treatment, spacer abundance post-antibiotics plus outgrowth was compared to after outgrowth only. For simplicity, the description below treats post-antibiotics as the ‘post’ condition and outgrowth only as the ‘pre’ condition.

    To compare CRISPRi libraries, spacers were filtered to remove any position with fewer than 10 reads in both post and pre libraries. Then, the frequency of each spacer was calculated by dividing the number of reads for that spacer by the total number of reads in the library. A pseudocount of 0.99 was added to spacers with 0 counts. Based on the assumption that spacers targeting intergenic regions outside of promoters would not affect phenotypes, we used these intergenic spacers to normalize spacer abundance in both pre and post libraries. All spacer frequencies were normalized as follows (for exemplified spacer labelled A):

    $${{\rm{enrichment}}}_{{\rm{A}}}={\log }_{2}{({{\rm{spacer}}}_{{\rm{A}}}/{{\rm{GM}}}_{{\rm{null}}})}_{{\rm{post}}}-{\log }_{2}{({{\rm{spacer}}}_{{\rm{A}}}/{{\rm{GM}}}_{{\rm{null}}})}_{{\rm{pre}}}$$

    where GMnull is the geometric mean of the frequencies of all null (intergenic) spacers.
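
    As an illustration of the enrichment formula, the following Python sketch computes spacer frequencies, the geometric mean of null-spacer frequencies, and the resulting log2 enrichment. The function names and the dictionary layout are hypothetical, and applying the 0.99 pseudocount to zero-count spacers before dividing by the library total is an assumption about the order of operations.

```python
import math

def frequencies(counts, pseudocount=0.99):
    """Spacer frequency = reads / total reads; zero counts get a pseudocount
    (assumed to be applied before dividing by the library total)."""
    total = sum(counts.values())
    return {s: (n if n > 0 else pseudocount) / total for s, n in counts.items()}

def geometric_mean(values):
    return math.exp(sum(math.log(v) for v in values) / len(values))

def enrichment(pre_counts, post_counts, null_spacers):
    """log2(spacer / GM_null) in post minus the same quantity in pre."""
    f_pre, f_post = frequencies(pre_counts), frequencies(post_counts)
    gm_pre = geometric_mean([f_pre[s] for s in null_spacers])
    gm_post = geometric_mean([f_post[s] for s in null_spacers])
    return {
        s: math.log2(f_post[s] / gm_post) - math.log2(f_pre[s] / gm_pre)
        for s in pre_counts
    }

# Toy example: spacer A rises fourfold relative to two null spacers.
scores = enrichment(
    {"A": 100, "null1": 100, "null2": 100},
    {"A": 400, "null1": 100, "null2": 100},
    ["null1", "null2"],
)
```

    In this toy example spacer A receives an enrichment score of 2 (a fourfold increase relative to the null spacers), and the null spacers score 0.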

    To calculate gene enrichment scores, mean enrichment scores for spacers aligned within or directly upstream of each gene were calculated. The number of spacers mapping to each gene varied, which was important for computing gene enrichment P values. The null distribution of enrichment scores for intergenic spacers was randomly sampled to generate pseudogenes with n spacers. This was repeated to generate 100,000 simulated replicates for every relevant n. To assign a P value, each gene enrichment score was compared to a simulated null distribution with the same number of spacers as included for that gene. Significantly enriched or depleted genes were found based on an FDR of 0.1 using the Benjamini–Hochberg method85. In most cases, significant genes were further filtered by significance in multiple replicates. For hipA7, only one replicate of each screen was done. In Fig. 4b, we wanted to highlight top hits, so we used Bonferroni correction to threshold only the most significant hits. In other figures, we used an FDR of 0.1 for the hipA7 replicates.
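
    The simulation-based P values and the FDR threshold can be sketched as follows. This is a hedged illustration rather than the pipeline used in the study: the function names are hypothetical, sampling null spacers with replacement and using a two-sided comparison are assumptions, and the add-one correction in the empirical P value is a common convention that the text does not specify.

```python
import random

def empirical_p(gene_score, n_spacers, null_scores, n_sim=100_000, rng=None):
    """Compare a gene's mean enrichment to simulated pseudogenes built from
    n_spacers randomly drawn null (intergenic) spacer scores."""
    rng = rng or random.Random(0)
    sims = (
        sum(rng.choices(null_scores, k=n_spacers)) / n_spacers
        for _ in range(n_sim)
    )
    extreme = sum(abs(s) >= abs(gene_score) for s in sims)
    return (extreme + 1) / (n_sim + 1)  # add-one correction (assumption)

def benjamini_hochberg(pvals, fdr=0.1):
    """Return a list of booleans marking P values significant at the given FDR."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= fdr * rank / m:
            cutoff = rank  # largest rank passing the step-up criterion
    significant = set(order[:cutoff])
    return [i in significant for i in range(m)]
```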

    For each gene in the CRISPRi screens, enrichment and significance were calculated independently for crRNAs targeting the antisense or sense strand; the strand with strongest effect (by significance then enrichment score) is determined and included in each relevant figure. For Fig. 4b,c, strand is noted in source data. In Fig. 4d and Extended Data Fig. 10, the strand shown is antisense unless otherwise noted. To assess depletion of essential genes (Extended Data Fig. 9f,g), we used a stringent set of genes found to be essential in all of four previous datasets86.

    Pathway enrichment with iPAGE

    To find pathways significantly correlated with principal component 1 or 2 of the cell atlas (Extended Data Fig. 1o), we divided the principal component loadings into high (greater than 0.025) and low (less than −0.025) groups. We ran iPAGE87 in discrete mode (up, down) with maximum P value of 0.001 and independence = 0. Redundant pathways were filtered manually, and representative ones are shown.

    To find genes enriched in either the early lag or the persister cluster (Extended Data Figs. 1p–u and 5k,l), differential expression analysis was performed as described. Using the Benjamini–Hochberg method85, an FDR of 0.01 was applied to select significantly over- and under-represented genes. These significance scores were used as input for iPAGE87, which was run in discrete mode (up, down, neutral) with maximum P value of 0.05 and independence = 0. Pathways were then filtered further by P values indicated in figure legends. Redundant pathways and those indicating enrichment of a single operon were filtered manually; representative terms are shown. To compute mean expression, the AverageExpression function in Seurat was used, and mean gene expression values were averaged for all genes in a given set.

    To find genes enriched in each persister type versus early exponential cells (Extended Data Figs. 2 and 6k) differential expression analysis was performed as described. Using the Benjamini–Hochberg method85, an FDR of 0.01 was applied to select significantly over- and under-represented genes. These significance scores were used as input for iPAGE87, which was run in discrete mode (up, down, neutral) with maximum P value of 0.1 and independence = 0. Pathways shown in Extended Data Fig. 2a–c(iv) are the top (P < 0.0005) non-redundant gene sets overexpressed in each persister type; when these pathways are also significant for another persister type, they are also labelled in that panel (*P < 0.05, **P < 0.005, ***P < 0.0005).

    To find genes enriched in persister cell groups versus tetracycline-treated cells (Figs. 3c and 4e and Extended Data Figs. 9a,b,e,i–n and 10a,b), differential expression analysis was performed as described. An FDR of 0.05 was applied85 to select over- and under-represented genes. These significance scores were used as input for iPAGE87, which was run in discrete mode (up, down, neutral) with maximum P value of 0.05 and independence = 0. To select pathways to show in Fig. 3c and Extended Data Fig. 9b, loadings of principal component 1 were also used as input for iPAGE (continuous mode, 8 bins, max_p = 0.005, independence = 0). The intersection of gene sets enriched in the top PC1 bin (P < 0.001) and gene sets over-represented in at least one persister type (P < 0.05; based on marker genes versus tetracycline-treated cells) are shown in Fig. 3c and Extended Data Fig. 9b. Redundant pathways were manually filtered.

    To find genes enriched after CRISPRi perturbation (Extended Data Fig. 10c–e), an FDR of 0.1 was applied85 using P values (computed as described above from null distribution) to select over- and under-represented genes. These significance scores were used as input for iPAGE87, which was run in discrete mode (up, down, neutral) with maximum P value of 0.05 and independence = 0. Gene sets were subsequently filtered by P value < 0.01. For each gene, antisense- and/or sense-targeting crRNAs can be significant. For this analysis, only one strand was used for input; if one or both were significant in a single direction, the gene was assigned that direction, but if the antisense and sense crRNAs were significant in opposite directions, then the one with the higher enrichment score (absolute value) was used. Pathways significantly enriched in >3 of 5 metG* replicates, >1 of 3 wild-type replicates, or in 1 hipA7 replicate are shown in Extended Data Fig. 10c–e.

    To find pathways enriched in proteomics data (Extended Data Fig. 12a–d), differential protein analysis was performed as described with DEP. Fold changes were used as input for iPAGE87, which was run in continuous mode with 5 bins and maximum P value of 0.05.

    Identification of E. coli gene homologues

    The proteins of E. coli K12 (n = 4,136) were downloaded from Biocyc88 version 25.1. To identify potential homologues, E. coli proteins were searched against genomes of diverse microbial organisms. A total of 2,421 genomes downloaded from JGI IMG were included in the search89. These genomes were selected based on quality (High Quality = “Yes” in IMG portal) and optimization for biodiversity. They represent 39 phyla, 68 classes and 168 orders (Supplementary Table 7) based on GTDB taxonomic classification90. The protein search was done using DIAMOND91 with the following parameters: “blastp -e 1e-10 -k 10000000 --query-cover 66 --subject-cover 50 -b8 -c1”. Protein hits with a maximum e-value of 1e-10 were kept as potential homologues for downstream analysis. For each protein, the number of genomes with homologues was counted and converted to frequency, shown in Extended Data Fig. 11q.

    Reporting summary

    Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

  • Compensatory evolution in NusG improves fitness of drug-resistant M. tuberculosis

    Bacterial strains

    Mtb strains are derivatives of H37Rv unless otherwise noted. ΔbioA Mtb was obtained from the Schnappinger laboratory64. E. coli strains are derivatives of DH5α (NEB), Rosetta2, or BL21(DE3) (Novagen).

    Mycobacterial cultures

    Mtb was grown at 37 °C in Difco Middlebrook 7H9 broth or on 7H10 agar supplemented with 0.2% glycerol (7H9) or 0.5% glycerol (7H10), 0.05% Tween-80, 1× oleic acid-albumin-dextrose-catalase (OADC) and the appropriate antibiotics (kanamycin 10–20 μg ml−1 and/or hygromycin 25–50 μg ml−1). ATc was used at 100 ng ml−1. Mtb cultures were grown standing in tissue culture flasks (unless otherwise indicated) with 5% CO2. Note that both 7H9 and 7H10 medium are normally supplemented with biotin (0.5 mg l−1; ~2 μM), thereby allowing growth of the ΔbioA Mtb auxotroph.

    Selection of Rif-resistant Mtb isolates

    For the selection of RifR H37Rv and ΔbioA Mtb, 5 independent 5-ml cultures were started at a density of ~2,000 cells per ml (to minimize the number of preexisting RifR bacteria) and grown to stationary phase (OD600 > 1.5). Cultures were pelleted at 4,000 rpm for 10 min, resuspended in 30 μl remaining medium per pellet and plated on 7H10 agar supplemented with Rif at 0.5 μg ml−1. After outgrowth, colonies were picked into 7H9 medium. After 1 week of outgrowth, an aliquot was heat-inactivated and the Rif resistance determining region of rpoB, rpoA and rpoC were amplified by PCR and Sanger sequenced. See Supplementary Table 4 for primer sequences.

    Generation of structural models

    The structural model of Mtb RNAP transcription initiation complex bound to Rif in Fig. 1a was generated by modelling Mycobacterium smegmatis RNAP bound to Rif (PDB: 6CCV)65 on to the transcription initiation complex structure (PDB: 6EDT)66.

    The cryo-EM structures of a NusG-bound paused elongation complex from Mtb (PDB: 8E74) in Fig. 2d, and the location of clinical isolate mutations in Fig. 4a are derived from Delbeau et al.13.

    Generation of individual CRISPRi strains

    Individual CRISPRi plasmids were cloned as described67 using Addgene plasmid 166886. In brief, the CRISPRi plasmid backbone was digested with BsmBI-v2 (NEB R0739L) and gel-purified. sgRNAs were designed to target the non-template strand of the target gene open reading frame (ORF). For each individual sgRNA, two complementary oligonucleotides with appropriate sticky end overhangs were annealed and ligated (T4 ligase NEB M0202 M) into the BsmBI-digested plasmid backbone. Successful cloning was confirmed by Sanger sequencing.

    Individual CRISPRi plasmids were then electroporated into Mtb. Electrocompetent cells were obtained as described68. In brief, an Mtb culture was expanded to an OD600 = 0.4–0.6 and treated with glycine (final concentration 0.2M) for 24 h before pelleting (4,000g for 10 min). The cell pellet was washed three times in sterile 10% glycerol. The washed bacilli were then resuspended in 10% glycerol in a final volume of 5% of the original culture volume. For each transformation, 100 ng plasmid DNA and 100 μl electrocompetent mycobacteria were mixed and transferred to a 2 mm electroporation cuvette (Bio-Rad 1652082). Where necessary, 100 ng plasmid plRL19 (Addgene plasmid 163634) was also added. Electroporation was performed using the Gene Pulser X cell electroporation system (Bio-Rad 1652660) set at 2,500 V, 700 Ω and 25 μF. Bacteria were recovered in 7H9 for 24 h. After the recovery incubation, cells were plated on 7H10 agar supplemented with the appropriate antibiotic to select for transformants.

    CRISPRi library transformation

    CRISPRi libraries were generated as described previously28. In brief, fifty transformations were performed to generate RifS and βS450L ΔbioA libraries. For each transformation, 1 μg of RLC12 plasmid DNA was added to 100 μl electrocompetent cells. The cells:DNA mix was transferred to a 2 mm electroporation cuvette (Bio-Rad 1652082) and electroporated at 2,500 V, 700 Ω, and 25 μF. Each transformation was recovered in 2 ml 7H9 medium supplemented with OADC, glycerol and Tween-80 (100 ml total) for 16–24 h. The recovered cells were collected at 4,000 rpm for 10 min, resuspended in 400 μl remaining medium per transformation and plated on 7H10 agar supplemented with kanamycin (see ‘Mycobacterial cultures’) in Corning Bioassay dishes (Sigma CLS431111-16EA).

    After 21 days of outgrowth on plates, transformants were scraped and pooled. Scraped cells were homogenized by two dissociation cycles on a gentleMACS Octo Dissociator (Miltenyi Biotec 130095937) using the RNA_01 program and 30 gentleMACS M tubes (Miltenyi Biotec 130093236). The library was further declumped by passaging 1 ml of homogenized library into 100 ml of 7H9 supplemented with kanamycin (see Mycobacterial cultures) for between 5 and 10 generations. Final RifS and βS450L ΔbioA Mtb library stocks were obtained after passing the cultures through a 10-μm cell strainer (Pluriselect SKU 43-50010-03). Genomic DNA was extracted from the final stocks and library quality was validated by deep sequencing (see ‘Genomic DNA extraction and library preparation for Illumina sequencing’).

    Pooled CRISPRi screen

    Pooled CRISPRi screens were performed as described28. In brief, 20-ml cultures were grown in vented tissue culture flasks (T-75; Falcon 353136) and 7H9 medium supplemented with kanamycin (see ‘Mycobacterial cultures’) and maintained at 37 °C, 5% CO2 in a humidified incubator.

    The screen was initiated by thawing 4× 1-ml aliquots of the Mtb ΔbioA (RifS or βS450L) CRISPRi library (RLC12) and inoculating each aliquot into 24 ml 7H9 medium supplemented with kanamycin in a T-75 flask (starting OD600 of 0.06). The cultures were expanded to approximately OD600 = 1.0, pooled and passed through a 10-μm cell strainer (pluriSelect 43-50010-03) to obtain a single cell suspension. The single cell suspension (flow-through) was used to set up six ‘generation 0’ cultures: three replicate cultures with ATc (+ATc) and three replicate control cultures without ATc (–ATc). From each generation 0 culture, we collected 10 OD600 units of bacteria (3 × 109 bacteria; 30,000X coverage of the CRISPRi library) for genomic DNA extraction. The remaining culture volume was used to initiate the pooled CRISPRi fitness screen. Cultures were periodically passaged in pre-warmed medium in order to maintain log phase growth. At generation 2.5, 5, and 7.5, cultures were back-diluted 1:6 (to a starting OD600 = 0.2) and cultivated for approximately 2.5 doublings. At generation 10, 15, 20, and 25, cultures were back-diluted 1:24 (to a starting OD600 = 0.05) and expanded for 5 generations before reaching late-log phase. ATc was replenished at every passage. By keeping the OD600 of the 20 ml cultures ≥ 0.05, we guaranteed sufficient coverage of the library (3,000X) at all times. At set time points (approximately 2.5; 5; 7.5; 10; 15; 20; 25 and 30 generations), we collected bacterial pellets (10 OD600 units) to extract genomic DNA.
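
    The coverage and passaging figures quoted above follow from simple arithmetic, sketched below in Python. The constants are derived only from the numbers in the text (10 OD600 units corresponding to 3 × 10^9 bacteria and 30,000X coverage, which implies a library of roughly 100,000 sgRNAs); the function names are hypothetical.

```python
import math

# Derived from the text: 10 OD600 units = 3e9 bacteria = 30,000X coverage.
BACTERIA_PER_OD_UNIT = 3e9 / 10
IMPLIED_LIBRARY_SIZE = 3e9 / 30_000  # about 100,000 sgRNAs (inferred)

def coverage(od600, volume_ml):
    """Average number of bacteria per sgRNA in a culture."""
    return od600 * volume_ml * BACTERIA_PER_OD_UNIT / IMPLIED_LIBRARY_SIZE

def doublings_to_recover(dilution_factor):
    """Doublings needed to return to the pre-dilution density."""
    return math.log2(dilution_factor)

min_coverage = coverage(0.05, 20)       # lowest density allowed in 20 ml
early_passage = doublings_to_recover(6)   # 1:6 back-dilution
late_passage = doublings_to_recover(24)   # 1:24 back-dilution
```

    Under these assumptions a 20-ml culture at OD600 = 0.05 holds about 3,000 bacteria per sgRNA, a 1:6 dilution corresponds to about 2.6 doublings, and a 1:24 dilution to about 4.6 doublings, in line with the approximate generation counts given in the text.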

    Genomic DNA extraction and library preparation for Illumina sequencing of CRISPRi libraries

    Genomic DNA was isolated from bacterial pellets using the CTAB-lysozyme method described previously69. Genomic DNA concentration was quantified using the DeNovix dsDNA high sensitivity assay (KIT-DSDNA-HIGH-2; DS-11 Series Spectrophotometer/Fluorometer).

    Illumina libraries were constructed as described28. In brief, the sgRNA-encoding region was amplified from 500 ng genomic DNA using NEBNext Ultra II Q5 master Mix (NEB M0544L). PCR cycling conditions were: 98 °C for 45 s; 17 cycles of 98 °C for 10 s, 64 °C for 30 s, 65 °C for 20 s; 65 °C for 5 min. Each PCR reaction contained a unique indexed forward primer (0.5 μM final concentration) and a unique indexed reverse primer (0.5 μM) (Supplementary Table 4). Forward primers contain a P5 flow cell attachment sequence, a standard Read1 Illumina sequencing primer binding site, custom stagger sequences to ensure base diversity during Illumina sequencing, and unique barcodes to allow for sample pooling during deep sequencing. Reverse primers contain a P7 flow cell attachment sequence, a standard Read2 Illumina sequencing primer binding site, and unique barcodes.

    Following PCR amplification, each 230 bp amplicon was purified using AMPure XP beads (Beckman–Coulter A63882) using two-sided selection (0.75X and 0.12X). Eluted amplicons were quantified with a Qubit 2.0 Fluorometer (Invitrogen), and amplicon size and purity were quality controlled by visualization on an Agilent 4200 TapeStation (Instrument- Agilent Technologies G2991AA; reagents- Agilent Technologies 5067-5583; tape- Agilent Technologies 5067-5582). Next, individual PCR amplicons were multiplexed into 20 nM pools and sequenced on an Illumina sequencer according to the manufacturer’s instructions. To increase sequencing diversity, a PhiX spike-in of 2.5–5% was added to the pools (PhiX sequencing control v3; Illumina FC-110-3001). Samples were run on the Illumina NextSeq 500 or NovaSeq 6000 platform (single-read 1 ×85 cycles, 8 × i5 index cycles, and 8 × i7 index cycles).

    Differential vulnerability analysis of Rif-resistant versus Rif-sensitive strains

    Gene vulnerability in the RifS and βS450L Mtb strains was determined using an updated vulnerability model based on the one previously described28. In the updated model, read counts for a given sgRNA in the minus ATc conditions were modelled using a negative binomial distribution with a mean proportional to the counts in the plus ATc condition, plus a factor representing the log2 fold change:

    $${y}_{i}^{-{\rm{ATc}}} \sim {\rm{NegBinom}}\left({\eta }_{i},\phi \right)$$

    $${\eta }_{i}=\log (\,{y}_{i}^{+{\rm{ATc}}}+{\lambda }_{i})+{\rm{TwoLine}}({x}_{i},{\alpha }_{l},{\beta }_{l},\gamma ,{\beta }_{e})$$

    where λi is an sgRNA-level correction factor estimated by the model, xi represents the generations analysed for the ith guide, and the TwoLine function represents the piecewise linear function previously described, which models sgRNA behaviour over the generations analysed. The logistic function describing gene-level vulnerabilities was simplified by setting the top asymptote of the curve (previously K) equal to 0, representing the fact that the weakest possible sgRNAs are expected to impose no effect on bacterial fitness, that is:

    $${\rm{Logistic}}\left(s\right)=\frac{{\beta }_{\max }}{\left(1+{{\rm{e}}}^{\left(-H\cdot \left(s-M\right)\right)}\right)}$$
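
    A direct Python transcription of this simplified logistic function is given below as a minimal sketch. The parameter values in the example are arbitrary and chosen only to show the shape of the curve; they are not estimates from the model.

```python
import math

def logistic(s, beta_max, H, M):
    """Simplified logistic: beta_max / (1 + exp(-H * (s - M)))."""
    return beta_max / (1.0 + math.exp(-H * (s - M)))

# Arbitrary illustrative parameters (not fitted values).
midpoint = logistic(0.5, beta_max=-2.0, H=10.0, M=0.5)
far_below = logistic(-5.0, beta_max=-2.0, H=10.0, M=0.5)
far_above = logistic(5.0, beta_max=-2.0, H=10.0, M=0.5)
```

    At s = M the function returns half of βmax, for s far below M it approaches 0, and for s far above M it approaches βmax.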

    The Bayesian vulnerability model was run for each condition independently, and samples for all the parameters were obtained using Stan running 4 independent chains with 1,000 warmup iterations and 3,000 samples each (for a total of 12,000 posterior samples for each parameter in the model after discarding warmup iterations).

    Differential vulnerabilities were estimated by two approaches. First, for each gene, the difference in pairwise (guide-level) vulnerability estimates was obtained, resulting in posterior samples of the differential vulnerability (delta-vulnerability). This effectively estimated the difference in the integrals of the vulnerability functions. Genes for which the 95% credible region did not overlap 0.0 were taken to have significant differential vulnerabilities between the strains.

    Next, to identify differences between genes which may not exhibit the expected dose–response curve, we estimated the fitness cost (log2FC) predicted by our model for a (theoretical) sgRNA of strength 0.0 (that is, Logistic(s = 0)). This represented the weakest phenotype theoretically possible with our CRISPRi system, which we call Fmin. The difference in this value between the strains was estimated for each gene (∆Fmin), and genes for which the 95% credible region did not overlap 0.0 were identified as significant differential vulnerabilities by this approach.
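
    The credible-region criterion used in both approaches can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, posterior draws from the two strains are assumed to be paired by index, and an equal-tailed interval from the 2.5th to the 97.5th percentile is assumed as the 95% credible region.

```python
import statistics

def credible_interval_95(samples):
    """Equal-tailed 95% interval: 2.5th and 97.5th percentiles."""
    cuts = statistics.quantiles(samples, n=40, method="inclusive")
    return cuts[0], cuts[-1]

def differential(samples_a, samples_b):
    """Posterior samples of the difference between two strains
    (draws paired by index, which is an assumption)."""
    return [a - b for a, b in zip(samples_a, samples_b)]

def is_significant(delta_samples):
    """True when the 95% credible region does not overlap 0.0."""
    lower, upper = credible_interval_95(delta_samples)
    return lower > 0.0 or upper < 0.0
```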

    Pathway analysis

    First, all annotated Mtb genes were associated with a pathway as defined by the Kyoto Encyclopedia of Genes and Genomes (KEGG) database70,71,72. If necessary, annotations were manually curated to update or correct pathway assignments. To quantify pathway enrichment, the query set was defined as the union of the upper quartile of differential vulnerabilities defined by both the original gene vulnerability calling method (ΔV) and the Fmin approach. The background set was defined as all annotated Mtb genes. Enrichment of the pathways identified as differentially vulnerable was calculated by an odds ratio and significance was determined with a Fisher’s exact test.
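
    The odds ratio and Fisher's exact test for a single pathway can be sketched in Python using only the standard library. The function names are hypothetical, and a one-sided (enrichment) test is assumed because the text does not state the alternative hypothesis. The 2 × 2 table must contain disjoint groups, so background genes that are also in the query set are counted only in the query row.

```python
from math import comb

def odds_ratio(a, b, c, d):
    """a: query genes in pathway, b: query genes not in pathway,
    c: remaining background genes in pathway, d: remaining background genes not in pathway."""
    return (a * d) / (b * c)

def fisher_enrichment_p(a, b, c, d):
    """One-sided Fisher's exact test: probability of observing at least
    `a` pathway genes in the query set under the hypergeometric null."""
    n = a + b + c + d
    query = a + b
    pathway = a + c
    total = comb(n, query)
    upper = min(query, pathway)
    return sum(
        comb(pathway, k) * comb(n - pathway, query - k)
        for k in range(a, upper + 1)
    ) / total
```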

    phyOverlap

    To detect associations between gene variants and Rif resistance, we employed a phylogenetic convergence test using the phyOverlap algorithm73 (https://github.com/Nathan-d-hicks/phyOverlap). In brief, FASTQ files were aligned to H37Rv genome (NC_018143.2) using bwa (version 0.7.17-r1188). FASTQ accession numbers are provided in Supplementary Table 3. Single-nucleotide polymorphisms (SNPs) were called and annotated using the HaplotypeCaller tool of the Genome Analysis Toolkit (version 3.5) using inputs from samtools (version 1.7). SNP sites with less than 10x coverage or missing data in >10% of strains were removed from the analysis. Repetitive regions of the genome (PE/PPE genes, transposases, and prophage genes) were excluded from the analysis. Known drug-resistance regions were further excluded so as not to bias phylogenetic tree construction. M. canettii was provided as an outgroup (NC_015848). We performed Maximum Likelihood Inference using RAxML (v8.2.11) to construct the ancestral sequence and determine the derived state of each allele. Overlap with Rif resistance was scored by dividing the number of genotypically predicted (Mykrobe v0.9.012) RifR isolates containing a derived allele by the total number of isolates with a derived allele at a given genomic position. To generate a gene-wide score, we excluded synonymous SNPs and averaged the individual nonsynonymous SNP scores, weighting the scores by the number of times derived alleles evolved across the phylogenetic tree. The significance of the overlap was then tested by redistributing mutation events for each SNP randomly across the tree and recalculating the score. This permutation was done 50,000 times to derive the P value. This analysis additionally used FastTree (version 2.1.11) and figTree (v1.4.4).
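
    The per-SNP overlap score and the weighted gene-wide score described above can be sketched as follows. This is not the phyOverlap code itself: the function names and input layout are hypothetical, and the tree-based permutation test is not reproduced.

```python
def snp_overlap(derived_isolates, rif_resistant_isolates):
    """Fraction of isolates carrying the derived allele that are predicted RifR."""
    derived = set(derived_isolates)
    return len(derived & set(rif_resistant_isolates)) / len(derived)

def gene_score(snps):
    """Weighted mean of nonsynonymous SNP scores.

    snps: list of (overlap_score, n_events) tuples, where n_events is the
    number of times the derived allele evolved across the tree.
    """
    total_weight = sum(weight for _, weight in snps)
    return sum(score * weight for score, weight in snps) / total_weight

# Toy example with hypothetical isolate names.
score = snp_overlap({"iso1", "iso2", "iso3", "iso4"}, {"iso1", "iso2", "iso9"})
combined = gene_score([(1.0, 3), (0.5, 1)])
```

    In the toy example, two of the four isolates with the derived allele are RifR, giving a SNP score of 0.5; the gene-wide score weights the first SNP three times as heavily as the second, giving 0.875.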

    dN/dS calculations

    The ratio of nonsynonymous (dN) to synonymous (dS) nucleotide substitutions was used to quantify selective pressure acting on nusG and rpoC. A dN/dS value less than one suggests negative or purifying selection whereas a dN/dS value greater than one suggests positive or diversifying selection. For this analysis, we used a collection of ~50,000 Mtb clinical isolate whole-genome sequences, as described41. Isolates were grouped based on the presence of genotypically predicted Rif resistance (Mykrobe v0.9.012), as well as the identity of the rpoB mutation (S450X or H445X; where X indicates any amino acid other than Ser or His, respectively) conferring RifR. The number of samples used in the nusG dN/dS analysis shown in Fig. 3 are as follows: 1,365 RifS, 350 RifR, 270 S450X, and 26 H445X. The number of samples used in the rpoC dN/dS analysis shown in Fig. 3 are as follows: 23,024 RifS, 13,993 RifR, 11,067 S450X, and 1,215 H445X. Insertions and deletions were necessarily excluded from this analysis. A bootstrap-analysis was performed to calculate the dN/dS ratios to reduce any potential effects of recent clonal expansion events or convergent evolution of a specific site, like acquired drug-resistance mutations, as performed previously44. The analysis was performed by sub-sampling 80% of total variants in each group. The sub-sampling was repeated 100 times. dN/dS values were calculated for each subset of samples using a python script obtained from the github repository: https://github.com/MtbEvolution/resR_Project/tree/main/dNdS.
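
    The sub-sampling scheme can be illustrated with the Python sketch below. It is not the script from the repository cited above: the function names are hypothetical, variants are reduced to the labels "N" (nonsynonymous) and "S" (synonymous), and dividing by fixed counts of nonsynonymous and synonymous sites is a simplified stand-in for a full dN/dS calculation.

```python
import random

def dn_ds(variants, nonsyn_sites, syn_sites):
    """Simplified dN/dS: (N / nonsynonymous sites) / (S / synonymous sites)."""
    n = sum(1 for v in variants if v == "N")
    s = sum(1 for v in variants if v == "S")
    if s == 0:
        return float("inf")  # undefined when no synonymous variants are drawn
    return (n / nonsyn_sites) / (s / syn_sites)

def bootstrap_dn_ds(variants, nonsyn_sites, syn_sites,
                    fraction=0.8, repeats=100, seed=0):
    """Recompute dN/dS on `repeats` random sub-samples, each containing
    `fraction` of the variants drawn without replacement."""
    rng = random.Random(seed)
    k = int(len(variants) * fraction)
    return [
        dn_ds(rng.sample(variants, k), nonsyn_sites, syn_sites)
        for _ in range(repeats)
    ]
```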

    SNP calling and upset plot

    SNPs for all Mtb clinical isolate whole-genome sequences were called as follows. FASTQ reads were aligned to the H37Rv genome (NC_018143.2) and SNPs were called and annotated using Snippy9 (version 3.2-dev) using default parameters (minimum mapping quality of 60 in BWA, samtools base quality threshold of 20, minimum coverage of 10, minimum proportion of reads that differ from reference of 0.9). Mapping quality and coverage were further assessed using QualiMap with the default parameters (version 2.2.2-dev). Samples with a mean coverage < 30, mean mapping quality ≤ 45, or GC content ≤ 50% or ≥ 70% were excluded. Drug resistance-conferring SNPs were annotated using Mykrobe (v0.9.012). The resulting SNP and drug-resistance calls were used to generate the values depicted in the upset plot.

    Phylogenetic trees

    Phylogenetic trees based on SNP calls described above were built using FastTree (version 2.1.11 SSE3). A list of SNPs in essential genes was concatenated to build phylogenetic trees. Indels, drug resistance-conferring SNPs, and SNPs in repetitive regions of the genome (PE/PPE genes, transposases and prophage genes) were excluded. Tree visualization was performed in iTol (https://itol.embl.de/).

    Barcode library production

    The barcode library was designed to include over 100,000 random 18-mer sequences cloned into a Giles-integrating backbone (attP only, no Integrase) containing a hygromycin resistance cassette with a premature stop codon (plNP472). Oligonucleotides were synthesized as a gBlocks Library by IDT, containing 104,976 fragments.

    plNP472 (1.6 μg) was digested with PciI (NEB R0655) and gel-purified (QIAGEN 28706). The library was PCR amplified using NEBNext High-Fidelity 2X PCR Master Mix (NEB M0541L). One 50-μl reaction was prepared, containing 25 μl of PCR master mix, 0.0125 pmol of the gBlock library, and a final concentration of 0.5 μM of the appropriate forward and reverse primers (Fwd: 5′-TTACGCGTTTCACTGGCCGATTG-3′ + Rev: 5′-TTTTGCTGGCCTTTTGCTCAAC-3′). PCR cycling conditions were: 98 °C for 30 s; 15 cycles of 98 °C for 10 s, 68 °C for 10 s, 72 °C for 15 s; 72 °C for 120 s. The PCR amplicons were purified using the QIAGEN MinElute PCR purification kit (QIAGEN 28004). One Gibson assembly reaction (NEB E2621) was prepared with 0.01 pmol μl−1 digested plNP472 backbone, 0.009 pmol μl−1 cleaned PCR amplicon, and master mix, representing a 1:2 molar ratio of vector:insert.

    Following incubation at 50 °C for 1 h, 7 μl of the Gibson product was dialysed to remove salts and transformed into 100 μl MegaX DH10B T1R Electrocomp Cells (Invitrogen C640003) diluted with 107 μl 10% glycerol. For each of three total transformations, 75 μl of the cells:DNA mix was transferred to a 0.1 cm electroporation cuvette (Bio-Rad 1652089) and electroporated at 2,000 V, 200 Ω, 25 μF. Transformations were washed twice with 300 μl provided recovery medium and recovered in a total of 3 ml medium. Cells were allowed to recover at 37 °C with gentle rotation. Recovered cells were plated across three plates of LB agar supplemented with zeocin. After 1 d incubation at 37 °C, transformants were scraped and pooled. One fourth of the pellet (3.2 g dry mass) was used to perform 24 minipreps using a QIAprep Spin Miniprep Kit (Qiagen 27104).

    Transformation of barcode library into Mtb

    The barcode library was transformed into RifS and βS450L Mtb expressing RecT (mycobacteriophage recombinase) similarly to the CRISPRi library (see ‘CRISPRi library transformation’), with minor modifications. In brief, cultures for competent cells were grown in 7H9 supplemented with kanamycin to retain the episomal recT-encoding plasmid (plRL4). Twenty-millilitre cultures were concentrated tenfold and transformed with 250 ng of library and 100 ng of non-replicating, Giles integrase-containing plasmid (plRL40). Additionally, after recovery, cells were plated on 7H10 agar supplemented with kanamycin and zeocin. Transformants were scraped after 29 days of outgrowth.

    ssDNA recombineering and validation of strains

    Clinical nusG, rpoB and rpoC mutants were introduced into RifS and βS450L Mtb using oligonucleotide-mediated (ssDNA) recombineering, as described previously68. In brief, 70-mer oligonucleotides were designed to correspond to the lagging strand of the replication fork, with the desired mutation in the middle of the sequence. Alterations were chosen to avoid recognition by the mismatch-repair machinery. RecT expression was induced ~16 h before transformation by addition of ATc to a final concentration of 0.5 μg ml−1. Competent cells (400 μl) were transformed with 5 μg of mutation-containing oligonucleotide and 0.1 μg of hygromycin resistance cassette repair oligonucleotide (50:1 ratio of mutant oligonucleotide to repair oligonucleotide) and recovered in 5 ml 7H9 medium.

    After 24 h of recovery, 200 μl of cells were plated on 7H10 plates supplemented with hygromycin. After 21 days of outgrowth, 12 colonies per construct were picked into 100 μl 7H9 medium supplemented with hygromycin in a 96-well plate (Fisher Scientific 877217). Fifty microlitres of culture was heat-inactivated at 80 °C for 2 h in a sealed MicroAmp 96-well plate (Fisher Scientific 07200684; Applied Biosystems N8010560). Fifty microlitres of heat-inactivated culture was mixed with 50 μl of 25% DMSO and lysed at 98 °C for 10 min.

    Mutations of interest and unique barcodes were confirmed by PCR amplification and Sanger sequencing. The region of interest was PCR amplified with NEBNext High-Fidelity 2X PCR Master Mix (NEB M0541L) using 0.5 μl of heat-lysed product with the appropriate primers, annealing temperatures and extension times (see Supplementary Table 4). Residual PCR primers were removed with Shrimp Alkaline Phosphatase (rSAP; NEB M0371) and exonuclease I (exo; NEB M0293) according to the manufacturer’s instructions. Amplicons were then submitted for Sanger sequencing. One to three independent isolates were generated for all tested mutations.

    Pooled barcode competitive growth assay

    Validated mutants were first grown in 1 ml 7H9 with hygromycin and, after 3 days, expanded to 5 ml 7H9 with hygromycin. Strains were pooled to contain approximately 1.2 × 10⁷ cells for each mutant. The pool was then diluted to a starting OD600 of 0.01 in 7H9 supplemented with hygromycin. At this point, three 20 ml cultures in vented tissue culture flasks (T-75; Falcon 353136) were expanded to late-log phase and used as input for the competitive growth experiment. Sixteen OD600 units of cells were collected from each flask as the input culture (generation 0). Triplicate cultures were then diluted back to OD600 = 0.05 and grown for ~4.5 generations, back-diluted again to OD600 = 0.05 and grown for an additional 4 generations. After this, cultures were collected for a cumulative 8.5 generations of competitive growth.
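    The relationship between back-dilution OD600 and generation number can be sketched as follows. This is a minimal illustration that assumes OD600 scales linearly with cell number over the range used; the function names are hypothetical and are not part of the original protocol.

```python
import math

def generations(od_start: float, od_end: float) -> float:
    """Doublings between two OD600 readings (assumes OD600 is proportional to cell number)."""
    if od_start <= 0 or od_end <= 0:
        raise ValueError("OD600 values must be positive")
    return math.log2(od_end / od_start)

def target_od(od_start: float, n_generations: float) -> float:
    """OD600 expected after n_generations doublings from od_start."""
    return od_start * 2 ** n_generations

# Two growth legs, each started from a back-dilution to OD600 = 0.05.
od_after_first = target_od(0.05, 4.5)   # end of the ~4.5-generation leg
od_after_second = target_od(0.05, 4.0)  # end of the 4-generation leg

# Cumulative generations across both legs.
total = generations(0.05, od_after_first) + generations(0.05, od_after_second)
print(round(od_after_first, 2), round(od_after_second, 2), round(total, 1))
```

    Under this assumption, each leg would be collected at an OD600 of roughly 1.1 and 0.8, respectively.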

    Genomic DNA extraction and library preparation for next-generation sequencing followed the same protocol as that of the CRISPRi libraries (see above), with minor modifications. In brief, the barcode region was amplified from 100 ng genomic DNA using NEBNext Ultra II Q5 Master Mix (NEB M0544L). PCR cycling conditions were: 98 °C for 45 s; 16 cycles of 98 °C for 10 s, 64 °C for 30 s, 65 °C for 20 s; 65 °C for 5 min. Each PCR reaction contained a unique indexed forward primer (0.5 μM final concentration) and a unique indexed reverse primer (0.5 μM) (see Supplementary Table 4). Individual PCR amplicons were multiplexed into a 1 nM pool and sequenced on an Illumina sequencer according to the manufacturer’s instructions. To increase sequencing diversity, a PhiX spike-in of 20% was added to the pool (PhiX sequencing control v3; Illumina FC-110-3001). Samples were run on the Illumina MiSeq Nano platform (paired-read 2 × 150 cycles, 8 × i5 index cycles, and 8 × i7 index cycles).

    WGS and SNP calling for passaging timepoints and ssDNA recombinants

    Genomic DNA (gDNA) was extracted as described above. gDNA was diluted and subjected to Illumina whole-genome sequencing by SeqCenter. In brief, Illumina libraries were generated using the tagmentation-based and PCR-based Illumina DNA Prep kit and custom IDT 10 bp unique dual indices, generating 320 bp amplicons. Resulting libraries were sequenced on the Illumina NovaSeq 6000 platform (2 × 150 cycles). Demultiplexing, quality control and adapter trimming were performed with bcl-convert (v4.1.5).

    Reads were aligned to the Mtb (H37Rv; CP003248.2) reference genome using bwa (v1.3.1) with default parameters. Variant detection was performed by Snippy (v4.6.0)/freebayes (v1.3.1). Resulting vcf files were inspected for compensatory mutations (Supplementary Table 2) in rpoABC and/or the presence of the desired mutation.
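    Inspection of the resulting vcf files for variants in genes of interest could be sketched as below. This is an illustration only, not the pipeline used here: the gene coordinates are hypothetical placeholders that would need to be replaced with values from the reference annotation, and the function name is an assumption. Only the standard tab-separated VCF columns (CHROM, POS, ID, REF, ALT, ...) are relied on.

```python
from typing import Dict, Iterable, List, Tuple

# HYPOTHETICAL 1-based inclusive coordinates, for illustration only.
# Replace with the coordinates from the reference annotation in use.
REGIONS: Dict[str, Tuple[int, int]] = {"rpoB": (1000, 2000), "rpoC": (2100, 3000)}

def variants_in_regions(
    vcf_lines: Iterable[str], regions: Dict[str, Tuple[int, int]]
) -> List[Tuple[str, int, str, str]]:
    """Return (gene, POS, REF, ALT) for every VCF record falling inside a region."""
    hits = []
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, meta lines and the column header
        fields = line.rstrip("\n").split("\t")
        pos, ref, alt = int(fields[1]), fields[3], fields[4]
        for gene, (start, end) in regions.items():
            if start <= pos <= end:
                hits.append((gene, pos, ref, alt))
    return hits

# Made-up records to show the expected input shape.
example_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr\t1500\t.\tC\tT\t100\t.\t.",
    "chr\t2500\t.\tG\tA\t100\t.\t.",
    "chr\t5000\t.\tA\tG\t100\t.\t.",
]
hits = variants_in_regions(example_vcf, REGIONS)
print(hits)
```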

    Definition of putative compensatory nusG, rpoA, rpoB, rpoC variants

    Compensatory mutations in rpoA, rpoB and rpoC were taken from published sources and are described in Supplementary Table 2. Inclusion as a putative compensatory mutation in our list required that each reported variant in rpoA, rpoB or rpoC was found specifically in Rif-resistant strains, defined here as meaning that ≥90% of all strains harbouring the putative compensatory mutation were genotypically predicted (gDST) RifR. The use of the ≥90% gDST RifR cut-off allows for presumptive instances of incorrect gDST calls for strains harbouring rare compensatory variants. The strains used for this analysis come from the collection of approximately 50,000 whole-genome-sequenced Mtb strains described previously41.

    The rules used to define putative compensatory nusG mutations are as follows. Each observed nusG variant was assessed against the following three rules and, if it met at least one of them, was deemed a putative compensatory variant.

    1. At least 80% of the strains harbouring the nusG variant were genotypically predicted (gDST) RifR, and the variant was present in at least two distinct Mtb (sub)lineages. The use of the ≥80% gDST RifR cut-off allows for presumptive instances of incorrect gDST calls for strains harbouring rare nusG variants.

    2. All (100%) of the strains harbouring the nusG variant were gDST RifR and the variant was present in only a single Mtb sublineage, but the same or a nearby NusG site (±5 amino acids) was also mutated to an alternative amino acid that met the criteria stated in rule 1.

    3. The nusG variant altered a residue that was predicted, based on the Mtb NusG–RNAP structure13, to be important for the NusG pro-pausing activity (for example, NusG Trp120).
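    The three nusG rules above can be expressed as a small classifier. The sketch below is illustrative only: the data structure, field names, sublineage labels and example counts are hypothetical, and the set of structurally important positions for rule 3 is passed in by the caller rather than derived here (only Trp120 is named in the text).

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set

@dataclass(frozen=True)
class VariantStats:
    position: int                 # NusG amino-acid position
    alt: str                      # substituted amino acid
    n_strains: int                # strains harbouring the variant
    n_rifr: int                   # of those, strains predicted RifR by gDST
    sublineages: FrozenSet[str]   # sublineages in which the variant occurs

def meets_rule1(v: VariantStats, cutoff: float = 0.80) -> bool:
    """>= cutoff gDST RifR and present in at least two (sub)lineages."""
    if v.n_strains == 0:
        return False
    return v.n_rifr / v.n_strains >= cutoff and len(v.sublineages) >= 2

def meets_rule2(v: VariantStats, others: List[VariantStats], window: int = 5) -> bool:
    """100% gDST RifR in one sublineage, with a rule-1 variant at the same or a nearby site."""
    if v.n_strains == 0 or v.n_rifr != v.n_strains or len(v.sublineages) != 1:
        return False
    return any(
        (o.position, o.alt) != (v.position, v.alt)
        and abs(o.position - v.position) <= window
        and meets_rule1(o)
        for o in others
    )

def rules_met(v: VariantStats, others: List[VariantStats], structural: Set[int]) -> List[int]:
    """Return the rule numbers (1 to 3) satisfied by variant v."""
    met = []
    if meets_rule1(v):
        met.append(1)
    if meets_rule2(v, others):
        met.append(2)
    if v.position in structural:
        met.append(3)
    return met

# Hypothetical counts and sublineage labels, for illustration only.
a = VariantStats(124, "L", 10, 9, frozenset({"L2.2", "L4.1"}))
b = VariantStats(125, "S", 3, 3, frozenset({"L2.2"}))
c = VariantStats(120, "R", 1, 0, frozenset({"L4.1"}))
d = VariantStats(10, "A", 5, 2, frozenset({"L1.1"}))
pool = [a, b, c, d]
results = {(v.position, v.alt): rules_met(v, pool, structural={120}) for v in pool}
print(results)
```

    A variant would be deemed putatively compensatory when the returned list is non-empty.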

    The rules used to define a putative compensatory mutation in the rpoB β-protrusion were similar to those described for nusG, except that only rpoB β-protrusion residues at or near the NusG interface (RpoB Arg392–Thr410) were included in the analysis. Note that two such β-protrusion mutations (Thr400Ala and Gln409Arg) were previously identified as putative compensatory mutations17,74,75 (Supplementary Table 2).

    RifR rpoB allele frequency distribution calculations

    To check whether the observed distribution of RifR rpoB mutations differed among the three groups (all RifR strains in our clinical strain genome database, those harbouring known compensatory mutations in rpoA or rpoC, or those harbouring compensatory mutations in nusG or the β-protrusion), we performed a chi-squared test on the observed RifR rpoB mutant frequencies. Specifically, we took the RifR rpoB mutant frequencies observed in all RifR samples as an estimate of the base probabilities under the null hypothesis. We then used these base probabilities to calculate the frequency of mutations that would be expected in the other groups under the null hypothesis. That is:

    For each mutation (m):

    $$p(m)=\frac{{\rm{Number}}\,{\rm{of}}\,{\rm{times}}\,m\,{\rm{occurs}}\,{\rm{in}}\,{\rm{RifR}}\,{\rm{samples}}}{{\rm{Total}}\,{\rm{number}}\,{\rm{of}}\,{\rm{RifR}}\,{\rm{samples}}}$$

    For each group (G) and mutation (m),

    $$E\left[m| G\right]=p\left(m\right)\times {\rm{total}}\,{\rm{number}}\,{\rm{of}}\,{\rm{samples}}\,{\rm{in}}\,G$$
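    As a worked illustration of the two expressions above, the sketch below computes p(m), E[m|G] and the resulting chi-squared statistic for one group. The counts are invented for the example and the function names are hypothetical; this is not the analysis code used here.

```python
from typing import Dict

def base_probabilities(all_rifr_counts: Dict[str, int]) -> Dict[str, float]:
    """p(m): frequency of each rpoB mutation among all RifR samples."""
    total = sum(all_rifr_counts.values())
    return {m: c / total for m, c in all_rifr_counts.items()}

def expected_counts(p: Dict[str, float], group_total: int) -> Dict[str, float]:
    """E[m|G] = p(m) x total number of samples in G."""
    return {m: pm * group_total for m, pm in p.items()}

def chi_squared_statistic(observed: Dict[str, int], expected: Dict[str, float]) -> float:
    """Sum over mutations of (observed - expected)^2 / expected."""
    return sum(
        (observed.get(m, 0) - e) ** 2 / e for m, e in expected.items() if e > 0
    )

# Invented counts, for illustration only.
all_rifr = {"S450L": 700, "H445Y": 200, "D435V": 100}
group = {"S450L": 90, "H445Y": 5, "D435V": 5}

p = base_probabilities(all_rifr)
expected = expected_counts(p, sum(group.values()))
chi2 = chi_squared_statistic(group, expected)
print(expected, round(chi2, 3))
```

    A P value could then be obtained from the chi-squared distribution with the appropriate degrees of freedom, for example with scipy.stats.chi2.sf.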

    Protein expression and purification

    Mtb RNAP

    Mtb RNAP was purified as previously described66,76. In brief, plasmid pMP61 (wild-type RNAP) or pMP62 (S450L RNAP) was used to overexpress Mtb core RNAP subunits rpoA, rpoZ, a linked rpoBC and a His8 tag. E. coli Rosetta2 cells carrying pMP61 or pMP62 were grown in LB with 50 μg ml−1 kanamycin and 34 μg ml−1 chloramphenicol at 37 °C to an OD600 of 0.3, transferred to room temperature and left shaking to an approximate OD600 of 0.6. RNAP expression was induced by adding IPTG to a final concentration of 0.1 mM, and cells were grown for 16 h and collected by centrifugation (8,000g, 15 min at 4 °C). Collected cells were resuspended in 50 mM Tris-HCl, pH 8.0, 1 mM EDTA, 1 mM PMSF, 1 mM protease inhibitor cocktail, 5% glycerol and lysed by sonication. The lysate was centrifuged (27,000g, 15 min, 4 °C), and polyethyleneimine (PEI, Sigma-Aldrich) was added to the supernatant to a final concentration of 0.6% (w/v) and stirred for 10 min to precipitate DNA-binding proteins, including the target RNAP. After centrifugation (11,000g, 15 min, 4 °C), the pellet was resuspended in PEI wash buffer (10 mM Tris-HCl, pH 7.9, 5% v/v glycerol, 0.1 mM EDTA, 5 mM DTT, 300 mM NaCl) to remove non-target proteins. The mixture was centrifuged (11,000g, 15 min, 4 °C), the supernatant was discarded, and RNAP was eluted from the pellet into PEI elution buffer (10 mM Tris-HCl, pH 7.9, 5% v/v glycerol, 0.1 mM EDTA, 5 mM DTT, 1 M NaCl). After centrifugation, RNAP was precipitated from the supernatant by adding (NH4)2SO4 to a final concentration of 0.35 g ml−1. The pellet was dissolved in Nickel buffer A (20 mM Tris pH 8.0, 5% glycerol, 1 M NaCl, 10 mM imidazole) and loaded onto a HisTrap FF 5 ml column (GE Healthcare Life Sciences). The column was washed with Nickel buffer A and then RNAP was eluted with Nickel elution buffer (20 mM Tris, pH 8.0, 5% glycerol, 1 M NaCl, 250 mM imidazole). Eluted RNAP was subsequently purified by gel filtration chromatography on a HiLoad 26/600 Superdex 200 pg column in 10 mM Tris pH 8.0, 5% glycerol, 0.1 mM EDTA, 500 mM NaCl, 5 mM DTT. Eluted samples were aliquoted, flash frozen in liquid nitrogen and stored at −80 °C until use.

    Mtb σA–RbpA

    Mtb σA–RbpA was purified as previously described76,77. The Mtb σA expression vector pAC2 contains the T7 promoter, ten histidine residues, and a PreScission protease cleavage site upstream of Mtb σA. The Mtb RbpA vector is derived from the pET-20B backbone (Novagen) and contains the T7 promoter upstream of untagged Mtb RbpA. Both plasmids were co-transformed into E. coli Rosetta2 cells and selected on medium containing kanamycin (50 µg ml−1), chloramphenicol (34 µg ml−1) and ampicillin (100 µg ml−1). Protein expression was induced at an OD600 of 0.6 by adding IPTG to a final concentration of 0.5 mM and leaving cells to grow at 30 °C for 4 h. Cells were then collected by centrifugation (4,000g, 20 min at 4 °C). Collected cells were resuspended in 50 mM Tris-HCl, pH 8.0, 500 mM NaCl, 5 mM imidazole, 0.1 mM PMSF, 1 mM protease inhibitor cocktail, and 1 mM β-mercaptoethanol, then lysed using a continuous-flow French press. The lysate was centrifuged twice (15,000g, 30 min, 4 °C) and the proteins were purified by Ni2+-affinity chromatography (HisTrap IMAC HP, GE Healthcare Life Sciences) via elution in 50 mM Tris-HCl, pH 8.0, 500 mM NaCl, 500 mM imidazole, and 1 mM β-mercaptoethanol. Following elution, the complex was dialysed overnight into 50 mM Tris-HCl, pH 8.0, 500 mM NaCl, 5 mM imidazole, and 1 mM β-mercaptoethanol and the His10 tag was cleaved with PreScission protease overnight at a ratio of 1:30 (protease mass:cleavage target mass). The cleaved complex was loaded onto a second Ni2+-affinity column and was retrieved from the flow-through. The complex was loaded directly onto a size-exclusion column (SuperDex-200 16/16, GE Healthcare Life Sciences) equilibrated with 50 mM Tris-HCl, pH 8, 500 mM NaCl, and 1 mM DTT. The sample was concentrated to 4 mg ml−1 by centrifugal filtration and stored at –80 °C until use.

    Mtb CarD

    Mtb CarD was purified as previously described66,76. In brief, Mtb CarD was overexpressed from pET SUMO (Invitrogen) in E. coli BL21(DE3) cells (Novagen) and selected on medium containing 50 µg ml−1 kanamycin. Protein expression was induced by adding IPTG to a final concentration of 1 mM when cells reached an apparent OD600 of 0.6, followed by 4 h of growth at 28 °C, then collected by centrifugation (4,000g, 15 min at 4 °C). Collected cells were resuspended in 20 mM Tris-HCl, pH 8.0, 150 mM potassium glutamate, 5 mM MgCl2, 0.1 mM PMSF, 1 mM protease inhibitor cocktail, and 1 mM β-mercaptoethanol, then lysed using a continuous-flow French press. The lysate was centrifuged twice (16,000g, 30 min, 4 °C) and the proteins were purified by Ni2+-affinity chromatography (HisTrap IMAC HP, GE Healthcare Life Sciences) via elution at 20 mM Tris-HCl, pH 8.0, 150 mM potassium glutamate, 250 mM imidazole, and 1 mM β-mercaptoethanol. Following elution, the complex was dialysed overnight into 20 mM Tris-HCl, pH 8.0, 150 mM potassium glutamate, 5 mM MgCl2, and 1 mM β-mercaptoethanol and the His10 tag was cleaved with ULP-1 protease (Invitrogen) overnight at a ratio of 1/30 (protease mass/cleavage target mass). The cleaved complex was loaded onto a second Ni2+-affinity column and was retrieved from the flow-through. The complex was loaded directly onto a size-exclusion column (SuperDex-200 16/16, GE Healthcare Life Sciences) equilibrated with 20 mM Tris-HCl, pH 8, 150 mM potassium glutamate, 5 mM MgCl2 and 2.5 mM DTT. The sample was concentrated to 5 mg ml−1 by centrifugal filtration and stored at –80 °C.

    Wild-type Mtb NusG (+ mutants N65H, R124L and N125S)

    Plasmid pAC82 (or a mutant variant) was used to overexpress wild-type Mtb NusG13. Plasmids encoding NusG mutants were generated using Q5 site-directed mutagenesis (NEB) and sequenced to confirm the presence of target mutations. E. coli BL21 cells containing plasmids encoding different versions of Mtb NusG were grown in LB with 50 μg ml−1 kanamycin at 37 °C to an OD600 of 0.4, then transferred to room temperature and left shaking to an OD600 of 0.67. Protein expression was induced by adding IPTG to a final concentration of 0.1 mM, and cells were grown for an additional 4 h and then collected by centrifugation (4,000g, 20 min at 4 °C). Collected cells were resuspended in 50 mM Tris-HCl, pH 8.0, 500 mM NaCl, 5 mM imidazole, 10% glycerol, 1 mM PMSF, 1 mM protease inhibitor cocktail (Roche), 2 mM β-mercaptoethanol, and lysed by French press. The lysate was centrifuged (4,000 rpm for 20 min, 4 °C) and the supernatant was removed and applied to a HisTrap column pre-washed with 50 mM Tris-HCl, pH 8.0, 500 mM NaCl, 10% glycerol, 15 mM imidazole, and 2 mM β-mercaptoethanol. After loading the sample, the column was washed with five volumes of the same buffer, before gradient elution with 50 mM Tris-HCl, pH 8.0, 500 mM NaCl, 10% glycerol, 250 mM imidazole, and 2 mM β-mercaptoethanol. The eluted protein was mixed with PreScission protease and dialysed overnight at 4 °C in 20 mM Tris-HCl, pH 8.0, 500 mM NaCl, 10 mM β-mercaptoethanol to cleave the N-terminal His10 tag before applying to a HisTrap column to remove the uncleaved protein. The flow-through was collected and glycerol was added to a final concentration of 20% (v/v). Aliquots were flash frozen in liquid nitrogen and stored at –80 °C until use.

    Promoter-based in vitro termination assays

    The DNA sequence for the Mtb H37Rv 5S rRNA (rrf gene) intrinsic terminator was taken from Mycobrowser (MTB000021), with genomic coordinates of 1,476,999 to 1,477,077 base pairs. The intrinsic terminator was identified by predicting its RNA structure using mfold (RNA folding form v2.3) via the UNAFold Web Server. The intrinsic terminator was cloned downstream of a cytidine-less halt cassette in plasmid pAC7038, a gift of the R. Landick laboratory, using Q5 site-directed mutagenesis (NEB; following the manufacturer’s protocol) at an annealing temperature of 59 °C with GC enhancer for the PCR step, with primers 5′-TGGTGTTTTTGTATGTTTATATCGACTCAGCCGCTCGCGCCATGGACGCTCTCCTGA-3′ and 5′-CCGTTACCGGGGGTGTTTTTGTATGTTCGGCGGTGTCCTGGATCCTGGCAGTTCCCT-3′ (synthesized by IDT), to create plasmid pJC1. The 323-base-pair linear DNA fragment used for in vitro transcription assays was PCR amplified using Accuprime Pfx DNA polymerase (Invitrogen) at an annealing temperature of 56.5 °C, with primers 5′-GAATTCAAATATTTGTTGTTAACTCTTGACAAAAGTGTTAAAAGC-3′ and 5′-GTTGCTTCGCAACGTTCAAATCC-3′ (synthesized by IDT), following the manufacturer’s instructions, and purified using the QIAquick PCR Purification Kit (QIAGEN) to remove protein and exchange the buffer to 10 mM Tris-HCl pH 8.5.

    pJC1 contains the rrf termination site at approximately +150 bp. This template also contained a C-less cassette (+1 to +26). Core RNAP was incubated for 15 min at 37 °C with σA/RbpA in transcription buffer (20 mM Tris, 25 mM KGlu, 10 mM MgOAc, 1 mM DTT, 5 µg ml−1 BSA) to form holo-RNAP, followed by a 10 min incubation with 500 nM CarD at 37 °C. Holo-RNAP (200 nM) was then incubated with template DNA (10 nM) for 15 min at 37 °C. To initiate transcription, the complex was incubated with ATP + GTP (both at 16 µM), UTP (2 µM), and 0.1 µl per reaction [α-32P]UTP for 15 min at 37 °C to form a halted complex at U26. Transcription was restarted at 23 °C by adding a master mix containing NTP mix (A + C + G + U), heparin and NusG to final concentrations of 150 µM (each NTP), 10 µg ml−1 (heparin) and 1 µM (NusG). The reaction was allowed to proceed for 30 min, followed by a ‘chase’ reaction in which all four nucleotides were added to a final concentration of 500 µM each. After 10 min, aliquots were removed and added to a 2× Stop buffer (95% formamide, 20 mM EDTA, 0.05% bromophenol blue, 0.05% xylene cyanol). Samples were analysed on an 8% denaturing PAGE gel (19:1 acrylamide:bis-acrylamide, 7 M urea, 1× TBE, pH 8.3) for 1.25 h at 400 V, and the gel was exposed on a Storage Phosphor Screen and imaged using a Typhoon PhosphorImager (GE Healthcare).

    Quantification of termination and changes in termination

    Synthesized RNA bands on the gel image were quantified using ImageJ software (NIH). Each lane from below the rrf termination site (~150 nt) to above the runoff RNA products (263 nt) was converted to a pseudo-densitometer plot using the ImageJ line function and the relative areas of the termination and runoff bands were measured. Termination efficiency (TE) was calculated as the fraction of the termination (term) peak area relative to the total of the termination and runoff (term + runoff) peak areas. Fold changes in termination attributable to each NusG (∆T) were determined as the aggregate of changes in the termination rates kb and kt, as defined by von Hippel and Yager (equations (1) and (2))62,63. Multiple algebraic transforms can yield the aggregate fold changes in termination, ∆T, based on the following equations.

    $${\rm{TE}}=\frac{{k}_{t}}{{k}_{t}+{k}_{b}}$$

    (1)

    $${\rm{TE}}={\left[1+{{\rm{e}}}^{-\Delta \Delta {G}^{\ddagger }/RT}\right]}^{-1},$$

    (2)

    where ∆∆G‡ is the difference in activation barriers between termination and bypass, which is most directly related to the energies of RNAP–NusG and internal RNAP interactions that govern termination.

    $${\Delta \Delta G}^{\ddagger }=-RT\times {\rm{ln}}\left(\left(1/{\rm{TE}}\right)-1\right)$$

    (3)

    (equation (2) rearranged).

    $$\Delta T={{\rm{e}}}^{\left({\Delta \Delta G}_{2}^{\ddagger }-{\Delta \Delta G}_{1}^{\ddagger }\right)}$$

    (4)

    (fold change in aggregate termination rates for two conditions, 1 and 2).

    $$\Delta T=\frac{\left(\frac{1}{{{\rm{TE}}}_{2}}\right)-1}{\left(\frac{1}{{{\rm{TE}}}_{1}}\right)-1}$$

    (5)

    (alternative calculation derived from equation (1) assuming NusG only affects kb).

    Calculating ∆T using either the combination of equations (3) and (4) or equation (5) gives the same result because ∆T is the same whether conditions differ by aggregate effects on both kb and kt or by an effect on only one of them. We calculate ∆T using these approaches rather than the simple difference in energies of activation (\(\Delta \Delta {G}_{2}^{\ddagger }-\Delta \Delta {G}_{1}^{\ddagger }\)) because it allows a clearer graphical depiction of effects without changing the results. Errors in ∆T were calculated using a two-sided, unpaired t-test with no assumptions on variance.
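    As a worked illustration, the sketch below computes TE from termination and runoff peak areas, ∆∆G‡ in units of RT from equation (3), and ∆T from equation (5) as written. The peak areas are invented for the example and the function names are hypothetical; this is not the quantification script used here.

```python
import math

def termination_efficiency(term_area: float, runoff_area: float) -> float:
    """TE = term / (term + runoff), from densitometry peak areas."""
    total = term_area + runoff_area
    if total <= 0:
        raise ValueError("peak areas must sum to a positive value")
    return term_area / total

def ddg_in_rt_units(te: float) -> float:
    """Equation (3) divided by RT: ddG/RT = -ln((1/TE) - 1)."""
    if not 0.0 < te < 1.0:
        raise ValueError("TE must lie strictly between 0 and 1")
    return -math.log(1.0 / te - 1.0)

def delta_t(te1: float, te2: float) -> float:
    """Equation (5) as written: ((1/TE2) - 1) / ((1/TE1) - 1)."""
    return (1.0 / te2 - 1.0) / (1.0 / te1 - 1.0)

# Invented peak areas (arbitrary units) for two conditions, 1 and 2.
te1 = termination_efficiency(term_area=20.0, runoff_area=80.0)
te2 = termination_efficiency(term_area=50.0, runoff_area=50.0)

print(te1, te2)
print(ddg_in_rt_units(te1), ddg_in_rt_units(te2))
print(delta_t(te1, te2))
```

    Working in units of RT avoids having to assume a temperature for the conversion to absolute energies.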

    Electrophoretic mobility shift assay

    RNAP–NusG complexes were assembled and run on an electrophoretic mobility shift assay to test proper binding of all mutant NusGs. Core RNAP (200 nM) was incubated with the template strand of elongation scaffold DNA13 (50 nM) for 15 min at room temperature. Next, the complex was incubated with the complementary non-template strand (50 nM) for 15 min at room temperature. Finally, the complex was incubated with 1 µM wild-type NusG, N65H NusG, R124L NusG, or N125S NusG for 10 min at room temperature. All complexes were assembled in the following transcription buffer: 20 mM Tris, 25 mM potassium glutamate, 10 mM magnesium acetate, 1 mM DTT, 5 µg ml−1 BSA. Samples were immediately loaded and run on a native PAGE (4.5% acrylamide:bis solution 37.5:1, 4% glycerol, 1× TBE) for 1 h at 15 mA. The gel was run at 4 °C. The gel was first stained with GelRed (Biotium) followed by Coomassie blue for visualization of DNA and protein respectively.

    Reporting summary

    Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
