Lignin-derived (e.g. phenolic) compounds can compromise the bioconversion of lignocellulosic biomass to fuels and chemicals due to their toxicity and recalcitrance. The lipid-accumulating bacterium Rhodococcus opacus PD630 has recently emerged as a promising microbial host for lignocellulose conversion to value-added products due to its natural ability to tolerate and utilize phenolics. To gain a better understanding of its phenolic tolerance and utilization mechanisms, we adaptively evolved R. opacus over 40 passages using phenol as its sole carbon source (up to 373% growth improvement over wild-type), and extensively characterized two strains from passages 33 and 40. The two adapted strains showed higher phenol consumption rates (∼20 mg/l/h) and ∼2-fold higher lipid production from phenol than the wild-type strain. Whole-genome sequencing and comparative transcriptomics identified highly-upregulated degradation pathways and putative transporters for phenol in both adapted strains, highlighting the important linkage between mechanisms of regulated phenol uptake, utilization, and evolved tolerance. Our study shows that the R. opacus mutants are likely to use their transporters to import phenol rather than export them, suggesting a new aromatic tolerance mechanism. The identified tolerance genes and pathways are promising candidates for future metabolic engineering in R. opacus for improved lignin conversion to lipid-based products.
Lignocellulosic biomass, comprised of cellulose, hemicellulose and lignin (1,2), remains an underutilized substrate in sustainable microbial production of fuels and chemicals (3–6). One main challenge is that current biorefinery pretreatment approaches release diverse toxic degradation compounds from lignin during conversion of lignocellulosic biomass to fermentable sugars (7). These lignin degradation compounds include a wide array of phenolics that can severely inhibit microbial production of fuels or chemicals, leading to lower yields (8). Currently, unconverted lignin is typically burned to provide thermal energy onsite, but the amount of waste lignin is predicted to escalate as lignocellulose-based biorefinery output increases (4,9,10).
Lignin, the second most abundant terrestrial polymer, constitutes ∼15–30% of lignocellulose (11) and is more energy dense than cellulose and hemicellulose due to its higher carbon-to-oxygen ratio. Unfortunately, lignin is much more difficult to depolymerize due to its complex molecular structure. Structural heterogeneity also leads to a broad spectrum of breakdown products, substantially compromising the efficiency of chemical catalysis approaches for product synthesis and purification. Some bacteria and fungi can consume lignin breakdown products and utilize them as carbon sources (12), potentiating fuel and chemical production via lignin consolidated bioprocessing (13–15). One such bacterium, Rhodococcus opacus, is a promising microbial host for converting lignocellulose to useful products due to its naturally robust lipid production and ability to both tolerate and metabolize diverse phenolic compounds (14,16–18).
Rhodococcus strains are found in diverse environments (19,20) and can tolerate environmental stresses such as desiccation and high salinity (21,22) as well as chemical stresses such as high concentrations of butanol (23,24). Often isolated from polluted or contaminated environmental samples (25,26), R. opacus strains have a strong innate tolerance to benzene, toluene and lignocellulosic hydrolysates from different sources (27,28) and can metabolize aromatic compounds (14,16). Rhodococcus species can convert aromatics to acetyl-CoA and succinyl-CoA (12,29), which are important precursors for converting phenolics to bioproducts (30). Originally isolated from soil at a gas works plant, R. opacus PD630 (hereafter R. opacus unless specified) is known to accumulate large amounts of the biodiesel precursors triacylglycerols (TAGs, up to 76% of cell dry weight) when using sugars as a carbon source. Thus, R. opacus has been a target strain for commercial-scale lipid production using sugars derived from lignocellulose (18,31–34).
Growth inhibition by toxic compounds (either end-product or feedstock) is a major limiting factor for commercialization of biochemical processes (35,36). Developing production hosts with natural tolerance to toxic inhibitors may significantly reduce time and efforts for host optimization. The tolerance capabilities of R. opacus are hypothesized to come from its highly hydrophobic cell wall (22) and/or its ability to consume a diverse array of compounds (20), but few studies have directly investigated phenolic tolerance mechanisms in this organism. While we have recently explored the central metabolism in wild-type (WT) R. opacus and found simultaneous utilization of glucose and phenol (i.e. no catabolite repression) using 13C-metabolite fingerprinting (37), more work is necessary to thoroughly characterize the catabolic pathways of aromatic compounds and global metabolism to elucidate tolerance mechanisms in R. opacus.
In this study, we adaptively evolved increased phenol tolerance of R. opacus by sequentially sub-culturing in phenol as a sole carbon source and screening fast-growing mutants over 40 passages. We selected phenol as a model lignin degradation product to avoid the confounding effects of many compounds present in heterogeneous lignin degradation product streams. Phenol has a shared substructure to many of the compounds that can be derived from lignin (38), and it also has a similar level of toxicity to that of other compounds derived from lignin (39,40).
Even though previous studies have demonstrated bioconversion of lignocellulose-derived compounds into lipids by R. opacus (14,27,40–44), the connection between the tolerance phenotype and specific cellular mechanisms remains elusive. Here, we present a combined adaptive evolution/omics approach leveraging multiple phenol-adapted strains to identify possible mechanisms for phenolic tolerance and utilization in R. opacus (Supplementary Figure S1). This approach builds on previous studies by examining multiple paths to phenolic tolerance from the same strain background and resolving these tolerance strategies by comparing the transcriptomic response in different growth conditions. We performed adaptive evolution of R. opacus on increasing concentrations of phenol to select for increasingly tolerant strains and identified two high-performing strains through in-depth phenotyping. Next, we performed whole-genome sequencing of the selected strains to identify genomic alterations during adaptive evolution and employed comparative transcriptomics to identify transcriptional changes between strains in different growth conditions (i.e. in glucose and different concentrations of phenol). This approach proved to be effective in gaining insights into tolerance mechanisms and identifying promising gene candidates that can facilitate future metabolic engineering efforts in Rhodococcus to produce fuels and chemicals directly from lignin breakdown products or from lignocellulosic hydrolysates rich in toxic lignin degradation compounds.
MATERIALS AND METHODS
To increase the strain's tolerance to phenol, R. opacus PD630 was adapted to grow in increasing concentrations of phenol for 40 successive subcultures. To start the experiment, the WT strain was grown in 2 ml of 0.3 g/l phenol for 2 days, and then subcultured at 0.3 g/l phenol with an initial OD600 of 0.03 during the first 10 subcultures. Once the culture reached an OD600 of 0.3 within 24–36 h, the phenol concentration was increased in increments of 0.25 g/l up to a final concentration of 1.5 g/l phenol. Frozen stocks of each subculture were saved for later testing. To pick phenol-tolerant strains, frozen stocks from intermediate subcultures were streaked onto minimal media plates supplemented with 1.5% agar containing 0.3 g/l phenol. From three of the intermediate subcultures (17, 26 and 33) and the final subculture (40), six colonies were picked for growth screening and lipid assays in 96 well plates. The two fastest growing strains with best lipid productivities (named evol33 and evol40) were chosen from this screening for further analysis. For detailed growth conditions for other experiments and assays, see Supplementary Materials and Methods.
The two best performing strains were chosen for lipid assays using Nile red. Growth experiments were started with single colonies from LB plates, and cells were cultured in 2 ml of 0.3 g/l phenol in a 50 ml glass tube for 36–48 h. Next, 2 ml of cells was subcultured in 10 ml of 0.3 g/l phenol for 24 h in a 50 ml glass tube. Next, 10 ml of cells was added to 140 ml of 0.3 g/l phenol in a 250 ml flask and grown for 24 h. Cells were centrifuged at 3000 x g for 10 min, resuspended in carbon- and nitrogen-free media, and then each condition (1 g/l glucose, 0.75 g/l phenol or 1.5 g/l phenol) was tested for each strain with an initial OD600 = 0.3 in low nitrogen conditions (0.05 g/l (NH4)2SO4) at a 10 ml volume in a 50 ml glass culture tube. Cells were stained by Nile red according to the reported method (45) with some modifications. Briefly, 200 μl of cell culture was stored in 20% DMSO at −20°C until staining. For staining, cells were centrifuged at 1000 x g and room temperature for 3 min and resuspended in 200 μl of 1x PBS. 5 μl of 1 mM Nile red in DMSO was added to each sample in a 96-well plate, which was subsequently incubated in the dark at room temperature for 30 min. The plate was wrapped in foil, centrifuged at 1000 x g for 3 min and washed twice with 0.9% NaCl solution. Flow cytometry analysis was performed using a Millipore Guava EasyCyte High Throughput flow cytometer with a 488 nm excitation laser and a 575 nm emission filter. For all data, at least 5000 events were collected and gated by forward and side scatter. Samples were stained and analyzed on the same day to decrease variability. FlowJo (TreeStar Inc.) was used to obtain the arithmetic mean of the Nile red fluorescence distributions of stained cells, and the autofluorescence of unstained cells was subtracted from each measurement value. Because of differences in lag phases between strains, the lipid productivity was calculated as the average change in Nile red fluorescence (per cell per time) between at least four time points taken from the exponential to the stationary phase.
Illumina library preparation
A protocol adapted from Bowman et al. and Fisher et al. (46,47) was used to generate Illumina sequencing libraries. Double-stranded cDNA (for RNA-Seq) or genomic DNA (for whole genome sequencing) in 50 μl elution buffer (Qiagen) was fragmented to 200–400 bp using Covaris E-220 and microTUBEs at a recommended setting by Covaris. ∼50 μl of fragmented DNA in Buffer EB (Qiagen) was recovered from Covaris tubes and transferred to a semi-skirted 96-well polymerase chain reaction (PCR) plate. 20 μl of end repair mix (7 μl 10x T4 ligase buffer, 2.8 μl 10 mM dNTPs, 3.5 μl T4 polymerase, 3.5 μl T4 polynucleotide kinase, 0.7 μl Escherichia coli DNA polymerase I and 2.5 μl nuclease-free water; all enzymes were purchased from NEB) was added to the fragmented DNA to blunt and phosphorylate the ends at room temperature for 30 min. 130 μl of AMPure XP beads (Agencourt) were added to each well to remove reagents from the previous step. The 96-well plate was placed on a magnetic plate after pipetting thoroughly to mix, and supernatant was discarded when it became clear. The beads with DNA on the walls of 96 wells were washed with 150 μl 80% ethanol once, before eluting the DNA with 32 μl of water. 18 μl of dA-tailing mix (5 μl 10x NEBuffer 2, 1 mM dATP and 3 μl Klenow fragment (3′-5′ exo-)) was added to the beads with eluted DNA to add a dA-tail at 37ºC for 30 min. After dA-tailing, 110 μl of 20% polyethylene glycol (PEG) 8000 in 2.5 M NaCl was used to precipitate DNA onto the magnetic beads. The plate was placed on a magnet again to collect DNA and washed with 80% ethanol. Barcoded adapters were added to the eluted DNA (20.5 μl water added to DNA with beads) to a 10:1 molar ratio, and 6.5 μl of ligation mix was added to ligate adapters at room temperature for 20 min. 21 μl of 20% PEG in 2.5 M NaCl was added to precipitate DNA onto the beads, and two barcoded samples were pooled after washing for elution. Adapter-ligated DNA was then PCR-amplified using KAPA HiFi Hot start Readymix (KAPA Biosystems) and gel purified using a Zymoclean Gel DNA Recovery Kit (Zymo Research). The resulting purified DNA samples were quantified using a Qubit DNA HS kit, combined and diluted with Buffer EB to yield a 10 nM pool. 20 μl of the pool was submitted to HiSeq-2500 sequencing at the Center for Genome Sciences & Systems Biology, Washington University in St. Louis School of Medicine. For other details regarding DNA and RNA preparations, see Supplementary Materials and Methods.
Genome sequencing and SNP analysis
Raw Illumina reads were demultiplexed by barcodes and trimmed using an in-house Python script, and mapped to an indexed genome of R. opacus (both annotated supercontigs maintained by Broad Institute (Genbank#ABRH01000000 (20)) and the finished genome by Chinese Academy of Science (ASM59954v1 (48)) using Bowtie2 (49). The mapped SAM files were converted to sorted BAM files with BAM index files using SAMtools (50,51). SNPs and possible indels were analyzed using Pilon (52). SNPs that were also found in the WT strain were filtered out using an in-house Python script (see Supplementary Computational Methods). Additionally, MIRA (53) was used to confirm SNPs and indels predicted by Pilon. In addition to utilizing MIRA's SNP detection function, we performed de novo-assembly of a genome for each strain using MIRA. Assembled genomes were then used to further confirm presence or absence of SNPs in each strain. This deep sequencing confirmation eliminated SNPs that were inconsistently found due to Illumina sequencing errors or biases during Illumina preparations. Table 1A shows all SNPs found in evolved strains and not in WT.
Single nucleotide polymorphisms (SNPs) found in the adapted strains. (A) All SNPs found in the adapted strains but not in the WT strain are summarized. The mutations were called by both Pilon and MIRA. (B) Expression levels of the genes listed in A, shown in DESeq2-normalized counts, using plotCounts function. (C) Expression levels of the same genes listed in A, in Reads per Kilobase of transcript per Million mapped reads (rpkm). Rpkm values are calculated as explained in the Materials and Method section, and the average of triplicate samples is shown. Glc, 1 g/l glucose; LowP, 0.75 g/l phenol; HighP, 1.5 g/l phenol.
RNA-Seq data analysis
After demultiplexing and trimming, raw Illumina reads were mapped to an indexed genome of R. opacus (both annotated supercontigs (20) and the finished genome (48)) using Bowtie2 (49). The mapped SAM files were converted to sorted BAM files with BAM index files using SAMtools (50,51). FeatureCounts (54) was used to count the number of reads per coding sequence, and the raw result was used for differential expression analysis using DESeq2 (55) as implemented in the R software package.
Adaptive evolution of R. opacus using phenol as a sole carbon source
In the adaptive evolution approach, we increased phenol concentrations in a minimal salts medium (MSM) over time to select for the fastest growing strains using phenol as a sole carbon source (Supplementary Figure S1). Growth of the WT strain on phenol was characterized by a lag phase that increases with higher concentrations of phenol (Figure 1). To prevent long lag phases in adapted strains, cells were grown in non-inhibitory concentrations of phenol for the initial rounds of selection before increasing to inhibitory phenol concentrations. Serial transfers were performed when exponential phase was reached (optical density at 600 nm (OD600) of ∼0.3). For the total adaptive evolution experiment (40 serial passages), the estimated cumulative cell divisions (56) were ∼6 × 109, assuming four doublings per subculture and an initial cell concentration of ∼1 × 107 CFU/ml (colony forming unit per ml; initial OD600 of ∼0.03 was experimentally determined to be ∼1 × 107 CFU/ml). Over the course of the 40 serial transfers (∼200 generations), six colonies each from passages 17, 26, 33 and 40 (total of 24 colonies) were randomly selected for high-throughput phenotypic screening, along with WT (Supplementary Figure S1). We chose the two best-performing phenol-adapted strains based on improved growth rate, biomass yields and lipid yields: one derived from passage 33 (named evol33) and the other from passage 40 (evol40). These two adapted strains were subjected to further phenotypic characterization (i.e. growth profiles and lipid accumulation assay), deep whole-genome sequencing and comparative transcriptomic profiling by RNA-Seq.
Phenotypic characterization of phenol-adapted strains
To characterize improvements in tolerance from adaptive evolution, both adapted and WT strains were cultivated in a range of concentrations of phenol (Figure 1). We characterized phenol tolerance by comparing growth rates, final biomass accumulation (OD600), lag phases and estimated IC50 (half maximal inhibitory concentration; Supplementary Figure S2). The adapted strains evol33 and evol40 had 20% and 24% higher IC50 values than WT, respectively, after 45 h of culture (P < 0.001 for both evol33 and evol40, one mean, two-tailed Student's t-test). Both adapted strains also reached 22–373% higher cell densities (P < 0.05) than WT in all concentrations of phenol tested where growth was observed (0.75–2 g/l), with evol33 and evol40 having a 92% and 102% increase in OD600 at 1.5 g/l (P = 0.0005 and P = 0.0002, respectively; Supplementary Table S1). The lag phase of the adapted strains was 17–54% shorter compared to that of the WT strain, with evol33 showing a 46% shorter lag phase at 1.5 g/l (P = 0.029) and evol40 showing a 34% shorter lag phase compared to WT (P = 0.054; Supplementary Table S2). The adapted strains also demonstrated significant growth rate increases at intermediate concentrations of phenol, with 30% and 38% increases in growth rates at 1.5 g/l for evol33 (P = 0.03) and evol40 (P = 0.04), respectively (Supplementary Table S3). When cells were grown in 1.5 g/l phenol, we also observed significantly higher phenol consumption rates in the adapted strains compared to the WT strain (P = 0.002 and P = 0.003 for evol33 and evol40, respectively), suggesting that phenol uptake, consumption and growth are related (Supplementary Figure S3).
To determine whether adaptive evolution affected lipid accumulation, we measured lipid amounts under nitrogen-limited conditions using either glucose or phenol as a sole carbon source (Figure 2 and Supplementary Table S4). WT R. opacus accumulates intracellular lipids in the form of TAGs under nitrogen limitation in the stationary phase (18). Strains were grown in 1 g/l glucose (‘glucose’), 0.75 g/l phenol (‘low phenol’) or 1.5 g/l phenol (‘high phenol’) as sole carbon sources. Relative lipid amounts at different time points were estimated by staining cells with Nile red and measuring single cell fluorescence, which has been used to characterize total lipid contents in many oleaginous organisms (15,45) and correlates well with total lipid content (Supplementary Figure S4). Peak lipid accumulation occurs in the mid-stationary phase, followed by a decrease in lipid contents due to lipid consumption caused by extended carbon starvation. Because cells would be harvested at peak lipid accumulation for an industrial-scale process, we used the maximum Nile red fluorescence in the stationary phase to compare the lipid titer (Supplementary Table S4) between strains. The adapted strains showed similar maximum Nile red fluorescence to that of the WT strain when grown in glucose (P = 0.1 and P = 0.09 for evol33 and evol40, respectively). In contrast, when grown in low phenol, evol33 and evol40 showed a 2.5-fold (P = 5 × 10−5) and 1.9-fold (P = 0.001) increase in maximum Nile red fluorescence compared to WT, respectively (Supplementary Table S4). The adapted strains also showed high lipid accumulation in high phenol (Supplementary Table S4 and Supplementary Figure S5). In addition to titer, productivity is also an important factor for industrial scale processes, and we compared lipid productivity between strains by examining the average increase in Nile red fluorescence over time in exponential to stationary phase (Figure 2, Supplementary Table S5, and Supplementary Figure S5). Here, we define lipid productivity as the lipid amount per cell per unit time. In glucose, all three strains showed similar lipid productivities ranging from 7.8 to 8.4 arbitrary unit (a.u.) h−1. While the WT strain demonstrated a lower lipid productivity in low phenol (5.8 a.u. h−1), the adapted strains showed similar or higher lipid productivities in low phenol compared to glucose conditions (13.6 and 8.9 a.u. h−1 for evol33 and evol40, respectively). Compared to the WT strain, evol33 and evol40 has a 2.3- and 1.5-fold increase in lipid productivity in low phenol. The adapted strains also had higher lipid productivity in high phenol compared to glucose conditions (16.5 a.u. h−1, P = 0.001 for evol33; and 19.4 a.u. h−1, P = 0.0001 for evol40). These results prompted us to further characterize these strains using whole-genome sequencing and RNA-Seq.
Deep whole-genome sequencing of phenol-adapted strains
We sequenced the genomes of phenol-adapted R. opacus strains evol33 and evol40 along with the original wild-type strain to > 90X-coverage. For a reference genome, we used both annotated supercontigs maintained by the Broad Institute (Genbank # ABRH01000000 (20); OPAG_#s) and the finished genome by the Chinese Academy of Science (ASM59954v1 (48); LPD#s). Contrary to our initial expectations, there were no single nucleotide polymorphisms (SNPs) or insertion/deletions (indels) in promoter regions, transcriptional regulators, or sensors. In fact, the numbers of mutations were low, and when excluding mutations that were found in WT as well, there were only three consistent mutations (which were not due to sequencing errors) in evol40 and four mutations in evol33. While we explored the possibility of large duplications as genomes of Rhodococcus species are known to contain duplications (57–59)), we could not obtain consistent large duplication prediction based on short read sequencing only. The three common point mutations shared in the genomes of both evol33 and evol40 (Table 1A) are the following: LPD03112, annotated as ‘cytochrome c oxidase/cytochrome aa3 quinol oxidase’, contained a Phe to Ser mutation at amino acid position 335. LPD03747, annotated as ‘dicarboxylate carrier protein/citrate transporter/MFS transporter’, contained an Ile296 to Phe mutation. The third shared mutation occurred in OPAG_05248, which is annotated as ‘acyl-CoA thioesterase II’, and had a silent mutation of G to A (Phe120) when mapped to the supercontigs (20). When mapped to the complete genome (48), however, this mutation occurred in an intergenic region between LPD04711 and LPD04712 (Table 1A). Regardless of the annotations, overall reads mapped to this region were very low (reads per kilobase of transcript per million reads (rpkm) 0–10) and not phenol-responsive (Table 1B,C), suggesting caution when considering the biological implication of this mutation. evol33 additionally had a gene LPD00118 with a SNP; this gene is annotated as ‘4-hydroxyphenylacetate permease/MFS transporter’ (Gly162 to Val). Given two transporter/permease genes containing consistently found SNPs, these results suggest that adapted strains have altered transport of phenol or related compounds.
Transcriptomic analysis of phenol-adapted strains
Stress adaptation in evol33 and evol40 compared to WT
To investigate tolerance mechanisms further, we employed a comparative transcriptomic approach by sequencing ribosomal RNA-depleted total RNA from WT, evol33 and evol40 in the three medium conditions described in Materials and Methods. High phenol (1.5 g/l) was used for adapted strains only, since the wild-type strain did not reach adequate cell densities at this culture condition. High phenol condition data were used to assess whether the up- or downregulation of genes was phenol concentration-dependent. Three biological replicates of each strain and condition showed consistent gene expression results (Figure 3 and Supplementary Figure S7; in Figure 3, each square on a heat map represents a biological replicate). In addition to coping with the low nutrient-culture conditions (cells were grown in minimal media with low nitrogen to promote lipid accumulation), cells came in contact with phenol, which is generally toxic to cell membranes and cellular macromolecules (60,61). As expected, multiple putative stress genes were upregulated due to nitrogen-limited growth conditions and the toxicity of phenol. Some of the annotated stress genes that were upregulated in all strains in phenol growth conditions were ‘DNA protection during starvation protein’ LPD04106, ‘heat shock protein 20 homolog’ LPD07748 and ‘molecular chaperone DnaK’ LPD02079 (Table 2 and Supplementary Table S6, Sheet 13). LPD04106 was highly expressed in glucose, but even more upregulated in low phenol in WT and evol40 (Table 2 and Supplementary Table S6). evol33 showed modest upregulation of LPD04106 (2.64-fold) in low phenol. In adapted strains in high phenol, expression levels of LPD04106 were similar to that of glucose (no change in evol33 and 1.3-fold downregulation in evol40). LPD07748 was highly upregulated in low phenol in all strains, but to a less extent in adapted strains in high phenol compared to low phenol (Table 2 and Supplementary Table S6). LPD02079 was 16-fold upregulated in WT in phenol, while it was 3.1- to 6.3- fold upregulated in adapted strains in phenol (Table 2 and Supplementary Table S6). These results suggest that adapted strains show less stress response to phenol-containing media than WT.
Summarized log2 fold changes of selected genes (over glucose condition). WT LowP, WT in low phenol versus glucose; evol33 LowP, evol33 in low phenol versus glucose; evol33 HighP, evol33 in high phenol versus glucose; evol40 LowP, evol40 in low phenol versus glucose; evol40 HighP, evol40 in high phenol versus glucose. For DESeq2-normalized count data, counts are shown in each condition specified, rather than comparing to glucose (Glc, 1 g/l glucose; LowP, 0.75 g/l phenol; HighP, 1.5 g/l phenol). Count numbers were obtained using plotCounts function of DESeq2 and averaging triplicate samples. More information of this chart can be found in Supplementary Table S6.
Phenol degradation and utilization genes were highly upregulated in adapted strains
Genes that were more than 1024-fold upregulated in phenol compared to glucose across all strains were LPD06575 and LPD06576, which are annotated as ‘flavin-dependent monooxygenase, reductase subunit/phenol monooxygenase reductase’ and ‘4-hydroxyphenylacetate 3-monooxygenase’, respectively. In addition to these two genes, LPD06740, LPD06741 and LPD06742 were also more than ∼1000-fold upregulated in phenol in evol33 and evol40 (Figure 3, Table 2, and Supplementary Figure S8). LPD06740 and LPD06741 have high identity to LPD06575 and LPD06576, and are in the same orientation in tandem (Figure 4B). LPD06742 is annotated as catechol 1,2-dioxygenase (Figures 3 and 4). WT did upregulate the LPD06740 operon in phenol, but the expression levels were 3- to 18-fold lower than that of the adapted strains (Figure 3, Table 2, and Supplementary Figure S8). LPD06575/LPD06740 and LPD6576/LPD06741 are likely to be small and large components of two-component phenol hydroxylase, where LPD06575/LPD06740 are flavin reductases and LPD6576/LPD06741 are NADH-dependent phenol hydroxylases, as there are reports of biochemically proven two-component phenol hydroxylases in other Rhodococcus species (62,63). Gene duplication of this region is also documented, and R. opacus strain 1CP is thought to have a third copy of the genes (59). Based on our transcriptional data, R. opacus appears to utilize both copies of the two-component phenol hydroxylase genes, but the adapted strains have the ability to upregulate the second copy (phenol degradation cluster #2; Figures 3 and 4B) more than the WT strain. Notably, the second copy of the phenol hydroxylase genes is directly upstream of a catechol 1,2-dioxygenase gene (Figure 4B), which is the first gene in the catechol branch of the β-ketoadipate pathway (Figure 4A; 12). Both copies were tightly regulated by the presence of phenol, and AraC-family transcriptional regulators are found in the same operons (Figures 3 and 4B). Moreover, other downstream genes in the β-ketoadipate pathway, such as muconolactone delta isomerase and muconate cycloisomerase, are upregulated more in the adapted strains, especially in high phenol (∼2000-fold; Figures 3 and 4A), and they are located in the same operon as IclR-family regulators (LPD05454, LPD06569 and LPD06698; Figures 3 and 4A). We searched for genomic mutations near differentially regulated operons such as the phenol degradation cluster #2, but the closest SNP observed was over 2 Mbp upstream of differentially regulated genes in adapted strains, suggesting that the differential gene expression could be affected by an indirect cause, such as intracellular concentrations of regulatory molecules or metabolites.
RNA-Seq also revealed that R. opacus upregulates the gentisate pathway of aromatic degradation and utilization when phenol concentration was elevated to 1.5 g/l, while predicted meta-cleavage aromatic degradation pathway genes and anaerobic aromatic degradation genes did not seem utilized (Supplementary Figures S9 and S10). These results strongly suggest that adapted strains efficiently detoxify phenol mainly via the ortho-cleavage pathway and utilize it as a carbon source through the β-ketoadipate pathway (12).
There are over 380 genes annotated as transcriptional regulators or regulatory proteins in R. opacus. Of those, ten genes are annotated as ‘probable thc operon regulatory protein’ (AraC-family transcriptional regulator). Of the ten genes, LPD06577, LPD06739 and potentially LPD06574 are putative regulators of the highly phenol-responsive phenol degradation genes based on the proximity of the genes to phenol degradation genes and expression profiles (Figure 3). A gene annotated as ‘xylose repressor’ (LPD06565) may be involved in regulating the β-ketoadipate pathway cluster #2 because it is co-regulated with the genes in the cluster. β-ketoadipate pathway operons seem to be associated with genes annotated as Pca regulon regulatory protein (IclR family transcriptional regulator; LPD05454, LPD06569 and LPD06698). Although the gentisate pathway operon was adjacent to an AraC-family transcriptional regulator (LPD03777), their expression patterns were not similar (Supplementary Figure S9). However, their expression patterns were similar to an annotated serine/threonine-protein kinase gene (LPD03770) downstream of the gentisate pathway genes (Supplementary Figure S9).
Genes most downregulated by phenol in WT were LPD02697 (NDMA-dependent methanol dehydrogenase/lactaldehyde reductase; >6500-fold downregulated) and two hypothetical protein genes downstream of LPD02697 (Table 2 and Supplementary Table S6, Sheet 14). Adapted strains did not show dramatic downregulation of this gene in low phenol, but did show strong downregulation in high phenol (Table 2 and Supplementary Table S6, Sheets 15-18). A homolog of LPD02697 is implicated in oligotrophic growth of Rhodococcus erythropolis N9T-4 (64). In all strains, methane monooxygenase component genes were among the most downregulated in phenol conditions.
When comparing the low phenol and glucose conditions in terms of global regulatory changes, evol33 had higher average levels of upregulation from fewer genes than WT and evol40, and both adapted strains had fewer downregulated genes than WT, which were also generally downregulated to a lesser extent than WT (Supplementary Tables S6 and S7 and Supplementary Figure S11).
Upregulation of transporter genes in phenol
While genomic data suggested that phenol transport could be altered in the adapted strains, it did not give a complete picture about how the adapted strains could tolerate and utilize higher concentrations of phenol compared to WT. Significant expression differences were not observed between WT and adapted strains in transporter genes identified to have SNPs by whole-genome sequencing. However, a few other putative transporter genes were observed to have altered expression levels (Table 2 and Supplementary Table S6). We identified a highly upregulated gene in both adapted strains (401-fold and 102-fold upregulated in evol33 and evol40, respectively, compared to 5-fold in WT in low phenol, and upregulated to higher degrees in high phenol in adapted strains), which is annotated as a shikimate transporter gene (LPD06699). Shikimate is a cyclohexene compound that is a precursor for aromatic amino acids, and it contains functional groups similar to many phenolic compounds derived from lignin (65). LPD07505, also annotated as a shikimate transporter, was 3.2-fold downregulated in WT in low phenol, but was upregulated in both low and high phenol in adapted strains, especially in evol33 (8.5-fold compared to glucose in high phenol; Table 2 and Supplementary Table S6). No other genes annotated as shikimate transporters showed similar upregulation patterns, and shikimate utilization genes were not upregulated in phenol (Supplementary Table S6, Sheet 12). Moreover, phenol was more quickly consumed by the adapted strains than WT (Supplementary Figure S3), suggesting that increased phenol uptake may play a role in phenol tolerance and utilization in R. opacus.
Lipid biosynthesis and metabolism genes were differentially regulated in adapted strains.
R. opacus TadA is a heparin-binding hemagglutinin-like protein required for normal lipid droplet morphology, which is thought to colocalize with lipid droplets (48,66). Interestingly, the gene encoding TadA (LPD06283 (48,66)) was upregulated in phenol in all strains (Supplementary Table S6, Sheet 13; >24-fold in WT and >27-fold in evol33). This finding prompted us to speculate that lipid droplet morphology may be different in R. opacus strains depending on the carbon source. However, there was no marked change in size or number of lipid droplets when comparing cells in glucose to cells in phenol (Supplementary Figure S12). LPD05549, annotated as a fatty acid synthesis gene (FASI), was upregulated 33- to 64-fold in all strains in phenol compared to glucose. 8 out of the 11 genes annotated as stearoyl-CoA 9-desaturase or stearoyl-CoA 9-desaturase electron transfer partner genes were at least 4-fold upregulated in WT in phenol, whereas 24 of the 45 genes annotated as fatty acid –CoA ligase were greater than 2-fold downregulated in phenol in WT (19/45 genes in evol33 and 22/45 genes in evol40 were >2-fold upregulated), although the degree of fold change differed significantly between WT and adapted strains (representative genes shown in Table 2 and Supplementary Table S6, Sheet 13).
Effects of nitrogen limitation on R. opacus transcriptome
In glucose, strain-to-strain variation in global expression was low (Supplementary Figures S7 and S13), and the most highly expressed genes assessed by DESeq2 were LPD00472 (porin protein), LPD04147 (linear gramicidin synthase subunit C/non-ribosomal peptide synthase), LPD03031 (nitrite extrusion protein) and LPD03032 (nitrite reductase; Supplementary Table S6). LPD00472 was a highly abundant transcript and was slightly downregulated (up to 4-fold) in phenol growth conditions, probably reflecting the hydrophobic nature of phenol compared to hydrophilic glucose. Normalized counts using plotCounts function of DESeq2 comparing these genes, phenol degradation genes and constitutively highly expressed housekeeping genes (such as DNA-dependent RNA polymerase subunits (LPD06131/6132), ClpX (LPD05487), and elongation factor Tu (LPD06091)) are shown in Table 2. Because we used a minimal salts medium with a reduced nitrogen concentration, we expected that genes involved in nitrogen limitation would be highly expressed. This speculation was confirmed by the extremely high expression of LPD03031 and LPD03032 (Table 2). In fact, some of the most highly expressed genes across all strains in all medium conditions included proteasome genes LPD04933 (Mpa homolog/ATPase) and LPD04928 (proteasome endopeptidase complex beta subunit). As shown in R. erythropolis (67), R. opacus is likely equipped with a complete set of components to constitute active proteasomes, where proteins are presumably recycled and utilized in nitrogen-limited conditions. Changes in carbon source (glucose or phenol) did not seem to affect proteasome gene expression (Table 2; less than 2-fold up- or downregulated).
In this work, we analyzed genomic and transcriptomic changes between adaptively evolved R. opacus mutants and the WT strain grown in phenol. Laboratory adaptive evolution has been used for many different bacteria to improve growth on diverse compounds (68). We used phenol as a sole carbon source in our adaptive evolution process and identified strains with higher tolerance to the toxic chemical by measuring their total biomass accumulation, lag phase and growth rate (Figure 1 and Supplementary Tables S1–S3). After ∼200 generations, adapted strains contained 3–4 SNPs in the genomes, which was more rapid accumulation of SNPs than reported by Lenski et al. in their long-term evolution experiment and other labs (8.9 × 10−11 per base-pair per generation; 69). Strong selective pressure by the toxic compound being the sole carbon source probably contributed to the faster rate of SNP accumulation compared to the E. coli experiment, although the ‘normal’ rate of SNP accumulation during adaptive evolution in R. opacus has not been determined. Unfortunately, we did not identify ‘obvious’ SNPs that contributed to the striking transcriptomic profile difference between WT and evolved strains. The two adapted strains had improved growth profiles with increased biomass accumulation and shorter lag phases that could be due to altered transport of phenol in the mutants, leading to changes in intracellular concentrations of phenol and thus expression of phenol-responsive genes for phenol detoxification and degradation.
Differences were also observed between WT and adapted strains in terms of lipid accumulation. From RNA-Seq expression data, phenol appears to be mainly degraded into acetyl-CoA and succinyl-CoA via the β-ketoadipate pathway (12). Acetyl-CoA can be integrated directly into fatty acid biosynthesis pathways, and succinyl-CoA can enter the TCA cycle to generate reducing equivalents and ATP for cell growth and lipid synthesis (70). Compared to WT, the adapted strains maintained the ability to synthesize and accumulate lipids when grown in glucose, but they both accumulated significantly more lipids than WT when grown in phenol (Figure 2, Supplementary Tables S4 and S5).
The adapted strains in this study demonstrated high phenol consumption capabilities that match or surpass other phenol-degrading bacterial strains and mixed cultures. Pseudomonas putida MTCC1194, which was adapted for growth in phenol as a sole carbon source, had maximum phenol degradation rates of ∼12 mg phenol/l/h during growth at its maximum phenol concentration of 1 g/l (71). Another approach used mixed cultures to increase phenol degradation capacity, with tolerances up to 0.8 g/l phenol and degradation rates of 15.4 mg/l/h (72). Another approach is to isolate strains directly from environments such as contaminated wastewater. Bacillus brevis, isolated from an industrial wastewater, showed the highest phenol tolerance and utilization at concentrations up to 1.75 g/l phenol with degradation rates of ∼20 mg/l/h (73). Our adapted strains showed higher tolerance than B. brevis with growth in 2 g/l phenol, and experiments at 1.5 g/l showed phenol degradation rates of 22 and 21 mg/l/h for evol33 and evol40, respectively (Supplementary Figure S3). In summary, evol33 and evol40 demonstrated comparative phenol degradation rates and tolerances to the best-performing environmental isolates.
Our transcriptomic data suggested that phenol-adapted strains express lower levels of stress-response genes in the conditions tested (low nitrogen, minimal media and phenol as a sole carbon source). Phenol is known to cause oxidative stress response in bacteria (74). In our study, R. opacus upregulated ‘DNA protection during starvation protein’ LPD04106, or Dps, which is thought to be involved in oxidative stress resistance (75,76). It seems likely that low nutrient condition can induce the gene, but the adapted strain, especially evol33, showed significantly lower upregulation of this gene compared to WT in phenol, suggesting that evolved strains sensed less oxidative stress in our experiment. A similar trend was observed with the major housekeeping chaperone genes. DnaK is a major bacterial Hsp70 (70 kDa heat shock protein) and functions with co-chaperones GrpE and DnaJ to prevent aggregation of denatured proteins (77,78). R. opacus expressed 2 copies of the dnaK operons (LPD2079–02081 and LPD03845–03847) in the conditions we tested, and the major operon appeared to be LPD02079 and the following 3 genes. All DnaK and co-chaperone genes were upregulated more in WT than in the phenol-adapted strains in low phenol, indicating that the low-nutrient, phenol medium condition encountered by WT may cause proteins to misfold, and that the phenol-adapted strains encounter lower level of cellular stress than WT at the same concentration of phenol in the media.
We observed clear patterns of altered gene regulation in phenol-adapted strains compared to WT R. opacus in response to phenol. In WT, one of the two copies of phenol hydroxylase genes was dominantly upregulated, but in adapted strains the second copy (LPD06740 and LPD06741) was also upregulated to the same or even higher level (Figure 3, Table 2 and Supplementary Figure S8). Both SNP analysis and RNA-Seq results suggest that transport of phenol or compounds related to its metabolism is altered in the adapted strains. In fact, phenol was consumed at faster rates by the adapted strains than the WT strain (Supplementary Figure S3). evol33, which additionally had a putative transporter gene with a single point mutation compared to evol40 (Table 1), upregulated phenol degradation genes more strongly than evol40 (Figure 3). It is well documented that many identified transcriptional regulators for aromatic degradation and catabolism are AraC/XylS family (79–81). AraC/XylS-family transcriptional regulators bind to ligands (e.g. arabinose), and upregulation of effector genes is tunable by adjusting the concentration of ligands (82–84). Therefore, it is possible that adapted strains allow more phenol molecules to be transported into the cell, which in turn activate the second phenol degradation operons to the degree observed (6307-fold upregulation in evol33, 2624-fold upregulation in evo40 and 657-fold upregulation in WT; Figure 3 and Supplementary Table S6). Higher upregulation of β-ketoadipate pathway genes in adapted strains could also be influenced by higher intracellular concentrations of phenol or its downstream metabolite, catechol, which has been observed with aromatic degradation pathways in other organisms (79,80).
There are 46 genes annotated as shikimate transporters in the complete genome of R. opacus. Of the 46, LPD06699 was uniquely co-upregulated with phenol degradation genes, at 353- and 103-fold in evol33 and evol40 comparing low phenol to glucose, and 937- and 572-fold in evol33 and evol40 comparing high phenol to glucose, respectively. In WT, LPD06699 was upregulated 5-fold in low phenol compared to glucose. Except for LPD07505, which was upregulated in phenol only in adapted strains, none of the other genes annotated as a shikimate transporter were regulated in the same manner (Supplementary Table S6). Furthermore, predicted shikimate utilization genes LPD05461, LPD06023, LPD17011 and LPD05485 were not highly transcribed or upregulated in phenol (Supplementary Table S6), suggesting that LPD06699 and potentially LPD07505 may be involved in phenol transport.
Another interesting finding was the upregulation of a putative YRNA and Rsr (85) in adapted strains in phenol. YRNAs were recently found to exist in a wide variety of bacterial species, including R. opacus (86). R. opacus Rsr homolog, LPD06269, caught our attention because of its phenol-responsive upregulation in evolved strains, even though it was annotated as a hypothetical protein. LPD06269 was constitutively expressed at a medium-low level in all strains in glucose, but upregulated 52-fold in low phenol compared to glucose and 115-fold in high phenol compared to glucose in evol33 (8.3-fold and 17-fold in evol40), and upregulation of the predicted YRNA region showed the same pattern (Supplementary Figure S14). In WT, transcript levels of LPD06269 slightly decreased in low phenol (0.87-fold), and no upregulation of YRNA region was observed, suggesting that this response was phenol-dependent and specific to the adapted strains. We are in a process of focusing on small noncoding RNAs in R. opacus, since we removed the majority of small RNAs from the total RNA samples in order to obtain maximal amount of mRNA in this study. Nonetheless, our results indicate that small RNA-mediated global response may also be involved in adaptive evolution of R. opacus against phenol as a sole carbon source, and it may explain the ability of R. opacus to rapidly adapt to diverse environments without accumulating substantial genomic changes.
This work suggests that characterizing funneling pathways (which convert lignin-derived compounds into pathway metabolites such as catechol; 87) and transporters for phenolics is important for conversion of phenolics into fuels and chemicals. Genes converting phenol to catechol were some of the most highly upregulated genes when strains were grown in phenol compared to glucose, suggesting that phenol-to-catechol conversion may be a limiting step for growth on phenol. Under the condition that we tested (low nitrogen, minimal medium and aerobic condition), phenol seems to be degraded mainly by ortho-cleavage via the catechol branch of the β-ketoadipate pathway into succinyl-CoA and acetyl-CoA, consistent with our previous result obtained from WT cultures grown in 0.5 g/l phenol supplemented with 13C glucose (37). Other phenolic compounds would require other funneling enzymes to convert them into precursors such as catechol and protocatechuic acid for degradation (87). This study also identified putative transporters for phenol, and transport of many different phenolic compounds would require either promiscuity for transporters or different transporters for each compound. Note that while previous studies on transporters of toxic chemicals focused on efflux pumps to minimize the intracellular concentrations of these compounds (88–90), R. opacus is likely to use its transporters to import toxic phenolics, suggesting a new mechanism of bacterial aromatic tolerance. Increased phenolic compound flux into the cell requires balancing between transport and degradation to prevent accumulation of toxic compounds within the cell.
A recent study demonstrated that adaptive evolution can increase R. opacus tolerance to lignocellulose-derived inhibitors (40). While they used media supplemented with glucose, our evolution process is different in that phenol was used as a sole carbon source to obtain highly tolerant mutants to phenol. Consequently, we were able to decouple the effects of other potentially available carbon sources in lignocellulose-derived feedstock on changes in the genome and transcriptome. We also demonstrated that phenol as a sole carbon source can support R. opacus for both growth and accumulation of lipids (biodiesel precursors) and that adaptive evolution can increase lipid productivity, an important factor for current lignin conversion processes (15). Combining adaptive evolution and omics analyses, our approach provides insights into the tolerance mechanism of the promising production host, facilitating future lignocellulose, specifically lignin, valorization efforts.
Supplementary Data are available at NAR Online.
The authors thank Merima Beganovic and Yu Gao for scientific discussions, Dr Wandy Beatty for performing transmission electron microscopy and assisting with fluorescence microscopy and Jessica Hoisington-López for her Illumina sequencing expertise.
U.S. Department of Energy [DE-SC0012705 to G.D., T.S.M. and M.F.]. Funding for open access charge: U.S. Department of Energy [DE-SC0012705].
Conflict of interest statement. None declared.