Multiplexed single-cell profiling of chromatin states at genomic loci by expansion microscopy

Abstract Proper regulation of genome architecture and activity is essential for the development and function of multicellular organisms. Histone modifications, acting in combination, specify these activity states at individual genomic loci. However, the methods used to study these modifications often require either a large number of cells or are limited to targeting one histone mark at a time. Here, we developed a new method called Single Cell Evaluation of Post-TRanslational Epigenetic Encoding (SCEPTRE) that uses Expansion Microscopy (ExM) to visualize and quantify multiple histone modifications at non-repetitive genomic regions in single cells at a spatial resolution of ∼75 nm. Using SCEPTRE, we distinguished multiple histone modifications at a single housekeeping gene, quantified histone modification levels at multiple developmentally-regulated genes in individual cells, and evaluated the relationship between histone modifications and RNA polymerase II loading at individual loci. We find extensive variability in epigenetic states between individual gene loci hidden from current population-averaged measurements. These findings establish SCEPTRE as a new technique for multiplexed detection of combinatorial chromatin states at single genomic loci in single cells.


INTRODUCTION
Proper regulation of genome activity and architecture is critical for development, growth, and function of a multicellular organism (1,2). Regulation occurs in large part at the nucleosome, where ∼147 bp of DNA wrap around an octamer of 4 different histone pairs: H2A, H2B, H3 and H4 (3). Various residues found at the N and C-terminal tails of these histones can acquire post-translational modifications, such as acetylation and methylation, which grant nucleosomes the ability to either participate in organized compaction of chromatin or to recruit transcriptionally relevant protein complexes (4,5). Researchers have therefore suggested that these modifications, also known as histone marks, act as a code for the epigenetic state of genomic regions (6,7). Although several sequencing-based methods are available for studying distinct histone modifications (i.e. ChIP-seq) (8,9), chromatin accessibility (10,11), genomic contact frequencies (12,13), and genomic nuclear locations (14), these methods are either unable to resolve cell-to-cell variations or are limited to studying one histone modification at a time. Therefore, the role these marks play in controlling chromatin structure and gene expression at the single cell and single locus level remains poorly understood and vigorously debated.
To tackle this problem, super-resolution fluorescence microscopy techniques have been used to observe more closely how histone marks impact chromatin organization within a cell's nucleus. Using Stochastic Optical Reconstruction Microscopy (STORM) (15,16), researchers saw that nucleosomes form clusters that vary in size and nuclear distribution depending on a cell's developmental stage or what histone marks they present (17,18). Others have combined STORM with DNA Fluorescence in situ hybridization (FISH) to map spatial aspects of genomic loci with a spatial resolution comparable to the observed sizes of these nucleosomal clusters (19). Collectively, these studies suggest that concurrent visualization of DNA and histone modifications with super-resolution microscopy could enable profiling chromatin states at the level of single loci. However, most studies to date have viewed histone marks and genes separately, because combining immunofluorescence and DNA FISH can be challenging due to the harsh solvents and/or high temperatures used in FISH protocols (20)(21)(22)(23). Although researchers have visualized immunolabeled histone marks across whole chromosomes (21,22), or at repetitive and highly abundant ALU elements regions labeled with an alternative hybridization strategy (24), there are still no methods available to study multiple histone marks at individual non-repetitive genomic loci at the level of individual nucleosomal clusters. A better understanding of histone mark heterogeneity at individual loci would require a new method capable of further decoupling immunofluorescence and FISH labeling.
We therefore developed a new method, called Single Cell Evaluation of Post-TRanslational Epigenetic Encoding (SCEPTRE), which uses expansion microscopy (ExM) (25,26) to combine DNA FISH with immunofluorescence and quantify histone mark fluorescence signals at individual loci within the nucleus. ExM preserves the signal of antibody labels on protein structures by covalently linking antibodies and proteins to a swellable hydrogel that is grown within the sample (25,26). This signal preservation enables subsequent use of relatively harsh conditions, such as high temperatures and organic solvents, for labeling of genomic DNA by FISH without loss of the antibody signal. At the same time, ExM enables the isotropic expansion of specimens with low distortion so that these specimens may be examined with a high spatial resolution (here ∼75 nm) in the expanded state even when using conventional microscopes with a diffraction-limited resolution of ∼250 nm. We demonstrate the capabilities of SCEPTRE for a variety of systems: (i) we compared signals of multiple histone marks at a housekeeping gene locus; (ii) we distinguished histone mark signals between developmentallyregulated genes in a single cell; (iii) we demonstrate a correlation between histone marks and paused RNA polymerase II in a single region. Together, these experiments establish SCEPTRE as a powerful tool to study the role histone marks have at individual genes within the nuclei of single cells.
Alpha-satellite, GAPDH set, adapter and conjugated reporter oligonucleotide probe sets were obtained from Integrated DNA Technologies (IDT). A Precise Synthetic Oligo Pool (SC1966-12) containing probes covering the MYL6, HOXC and LINC-PINT regions was obtained from Gen-Script (for a list of sequences, see supplementary spreadsheet).

Secondary antibody fluorophore conjugation
Conjugation was performed by mixing 40 l of a secondary antibody solution at a concentration of 1.3 mg/ml with 5 l of a 1 M sodium bicarbonate solution, then adding 2-5 g of an NHS ester functionalized fluorophore. The mixture was left to react for 30 min protected from ambient light and the crude reaction mixture was passed through a NAP-5 column (GE Healthcare Life Sciences, 17085301) for collection and purification of the fluorophore-conjugated secondary antibody. Further characterization of the secondary antibody was done by ultraviolet/visible absorption spectroscopy.

Immunostaining procedure
The immunostain procedure was adapted from previous protocols (17,18), and goes as follows: fixed RPE1 cells were incubated first in permeabilization solution (1× PBS with 0.1% (v/v) Triton X-100) for 10 min, then washed three times with 1× PBS. After permeabilization, cells were incubated in block solution (1× PBS with 10% (w/v) BSA and 3 mM sodium azide) for 1 h at room temperature, followed by incubation in primary solution (2-5 g/ml of primary antibodies diluted in block) overnight at 4 • C. The sample was washed with block three times (10 min each time), then incubated in secondary solution (2-3 g/ml of secondary fluorophore-conjugated antibodies in block) for 1-2 h at room temperature. The sample was washed once for 10 min with block, then three times with 1× PBS azide. Samples which had been originally fixed in EtOH:MeOH were postfixed in 4% PFA in 1× PBS for 10 min, then washed three times with 1× PBS azide. Immunostained samples were either immediately gelled or stored in 1× PBS azide at 4 • C for up to ∼1 week for later use (see Supplementary table S1 for more details).

Cell gelation, digestion and expansion
Expansion microscopy was adapted from a previous protocol (26), and goes as follows: immunolabeled cells were treated with freshly prepared 5 mM MA-NHS in 1× PBS for 10 min, then washed three times with 1× PBS. Cells were incubated in monomer solution (1× PBS with 2 M NaCl, 2.5% (w/w) acrylamide, 0.15% (w/w) N,N'methylenebisacrylamide and 8.625% (w/w) sodium acrylate) for 10 min before gelation with 0.15-0.2% (w/v) APS and 0.2% TEMED (w/w) at room temperature for at least 30 min in a sealed container backfilled with nitrogen gas. After polymerization, the cell-embedded hydrogel was gently removed from the 12 mm coverslip, then incubated in digestion solution (1× TAE with 0.5% (v/v) Triton X-100, 0.8 M guanidine HCl and 8 units/ml proteinase K) overnight at 37 • C. The digested sample was both washed and expanded by placing the sample in deionized water, which was replaced every 15-20 min for at least three times. Hydrogels were stored in 2× SSC at 4 • C, typically up to ∼1 month.

DNA fluorescence in situ hybridization
The general DNA FISH procedure for non-repetitive genomic regions (GAPDH, MYL6, HOXC and LINC-PINT) was adapted from previous protocols (30,31), and goes as follows: Briefly, a small (∼3.5 mm × 3 mm × 2 mm) piece of gel from each expanded cell sample was first incubated in hybridization buffer (2× SSC with 50% (v/v) formamide and 0.1% (v/v) Tween 20) for 10 min at room temperature. Samples were incubated in pre-heated hybridization buffer for 30 min at 60 • C. A hybridization mixture (2× SSC with 50% formamide (v/v), 10% dextran sulfate (w/v), 0.1% (v/v) Tween 20, 3 mM sodium azide, ∼10-20 nM oligo probe library per kb of targeted genomic region, and 1-1.5× concentration of oligo reporters and adapters to oligo probe library) specific to each sample was preheated to 90 • C for 5-10 min and then added to each sample at an approximate 2:1 volume ratio. Samples were denatured at 90-92.5 • C for 2.5-10 min and hybridized at 37-42 • C overnight. Samples were washed three times, 15 min each time: first with preheated 2× SSCT (2× SSC with 0.1% (v/v) Tween 20) at 60 • C, then with preheated 2× SSCT at 37 • C, and lastly with 2× SSCT at room temperature. Samples were stored at 4 • C in 0.2× SSCT (0.2× SSC with 0.01% (v/v) Tween 20) until needed (within a week). Samples were fully expanded to ∼4× the original size with deionized water at 4 • C, replacing the water twice every 10 min (see Supplementary table S1 for more details).
The DNA FISH procedure for the repetitive alphasatellite region was done as follows: expanded RPE1 cells were incubated for 1 h at room temperature in 1× PBS. The sample was then incubated in 1× PBS supplemented with 100 g/ml of RNase A for 1 h at 37 • C. After RNA digestion, the sample was incubated in 2× SSCT for 30 min at e82 Nucleic Acids Research, 2021, Vol. 49, No. 14 PAGE 4 OF 15 room temperature. The samples were then incubated in hybridization buffer for 30 min at room temperature. The gel was transferred to a hybridization buffer containing 200 nM of alpha-satellite oligonucleotide probe. The sample was denatured for 15 min at 95 • C. Gels were washed once in 20× SSC for 15 min at 37 • C, then in 2× SSC for 1 h at 37 • C. The samples were incubated in 2× SSC with 200 nM alphasatellite adapter probe and 600 nM of reporter probe A for 30 min at 37 • C. The sample was washed with 20× SSC for 20 min at 37 • C and lastly with 2× SSC for 20 min at room temperature. After this, the alpha-satellite sample was expanded to ∼3× the original size by incubating the sample in 0.2× SSC, then a second time in 0.2× SSC with 1 g/ml of Hoechst 33258 (see Supplementary table S1 for more details).

Sample mounting and imaging
For expanded samples using Alexa Fluor 750 fluorophoreconjugated reporters, samples were incubated in imaging buffer (10 mM Tris buffer (pH 8) with 1 mM Methyl viologen, 1 mM ascorbic acid and 2% (v/v) MeOH) for 10 min. Before imaging, samples were first adhered to a poly-L-lysine-coated rectangular no. 1.5 coverslip, then they were supplemented with ∼30 units/ml alcohol oxidase and 0.2% (w/v) catalase. Samples that did not have Alexa Fluor 750 were adhered to a poly-L-lysine-coated rectangular no. 1.5 coverslip. All samples were imaged with either a Leica SP5 inverted confocal point scanning microscope at the University of Washington Biology Imaging Facility using a Plan Apo CS 63× 1.2 numerical aperture (NA) water-immersion objective, a homebuilt spinning disk confocal microscope using a Nikon CFI60 Plan Apochromat 60× 1.27 NA water-immersion objective, or a conventional wide-field epifluorescence inverted Nikon Ti-S microscope using a Nikon CFI Plan Apo VC 60× 1.2 NA water-immersion objective (see Supplementary table S1 for more details).

CUT&RUN H3K4me3, H3K27me3 and H3K27ac profiling
CUT&RUN was performed as previously described (29), with the following adaptations: 250 000 trypsinized RPE1 cells were used per antibody condition. Cells were bound to Concanavalin A coated magnetic beads (Bangs Laboratories, BP531), permeabilized with 0.025% (w/v) digitonin, then incubated overnight with 5 g of either anti-H3K4me3 (Active Motif, 39159), anti-H3K27me3 (Active Motif, 39155) or anti-H3K27ac (Active Motif, 39133) in 500 l of solution at 4 • C. Cells were washed then incubated with protein A-MNase fusion protein (a gift from S. Henikoff, FHCRC) for 15 min at room temperature. After another wash, cells were incubated with 2 mM calcium chloride for 30 min at 0 • C to induce MNase cleavage activity. The reaction was stopped by adding 100 l of 2× STOP buffer (200 mM NaCl, 20 mM EDTA, 4 mM EGTA, 50 g/ml RNase A, 50 g/ml glycogen and 2 pg/ml of yeast spike-in DNA) to each sample. Cleaved histone-DNA complexes were isolated by centrifugation and DNA was extracted with a NucleoSpin PCR Clean-up kit (Macherey-Nagel, 740609).
Library preparation for each CUT&RUN antibody condition was done with a KAPA Hyper Prep Kit (VWR, 89125-040) with the PCR amplification settings adjusted to have simultaneous annealing and extension steps at 60 • C for 10 seconds. Library products between 200-300 base pairs were selected using Agencourt AMPure XP beads (Beckman Coulter, A63880) then sequenced with an Illumina MiSeq system at the University of Washington Northwest Genomics Center with paired-end 25 bp sequencing read length and TruSeq primer standard for ∼6 million reads per condition.
Paired-end sequencing reads were aligned separately to human (GRCh38/hg38) and yeast genomes using Bowtie2 (32) with the previously suggested specifications for mapping CUT&RUN sequencing data: (29) -local -verysensitive-local -no-unal -no-mixed -no-discordant -I 10 -X 700. Alignment results were converted to BAM files with SAMtools (33) and then to BED files with BEDTools (34). Reads were sorted and filtered to remove random chromosomes, then, with BEDTools genomecov, histograms were generated for the mapped reads using spiked-in yeast reads and the number of cells for each condition as scaling factors. The results were visualized using the WashU Epigenome Browser (https://epigenomegateway.wustl.edu/) (35).

Oligonucleotide probe design and amplification
DNA FISH probes were designed using OligoMiner (36), with standard buffer, length and melting temperature conditions, with the exception of the target MYL6, which had the following adaptations: base length between 28-42 nucleotides and melting temperature between 38 and 46 • C. Orthogonal DNA sequences, which were previously screened for DNA FISH purposes (19,37), were appended to each probe as adapter/reporter hybridizing regions specific to each gene, along with a primer set for amplification. Designed probes were purchased as part of an oligo pool from GenScript, and the probes were amplified using a T7/Reverse-Transcriptase amplification protocol previously published (31), in an RNase-free environment with the following adaptations: After PCR amplification with a Phusion Hot-start master mix and purification with a DNA Clean & Concentrator-5 kit (Zymo Research, D4013), probes were T7 amplified with a HiScribe T7 Quick High Yield RNA Synthesis Kit (New England BioLabs, E2050S) supplemented with 1.3 units/l RNase-OUT (ThermoFisher Scientific, 10777019) for 16 h at 37 • C. DNA was digested with DNase I for 1 h at 37 • C. RNA was purified from the sample by first adding LiCl solution from the HiScribe Kit at a 1:7 ratio to the RNA solution, incubating the solution at −20 • C for 30 min and pelleting the precipitated RNA by centrifugation (∼17 000 g) for 15 min at 4 • C. The supernatant was removed from the tube and the pellet was washed with 70% EtOH. The RNA pellet was centrifuged (∼17 000 g) for 5 min at 4 • C and, after carefully removing the supernatant, the pellet was left to dry for 3 min. The RNA was dissolved in water and

Image processing and analysis for SCEPTRE profiling
Image processing and analysis was performed using MAT-LAB. First, raw images obtained from immunofluorescence channels were smoothed with a gaussian filter using 1-2 standard deviations within a 3 × 3 × 3 matrix. Smoothed images were contrast adjusted, where background pixel levels were clipped at an adaptively determined threshold for each image set at 2-9 third quartiles away from the median of each image stack histogram. The contrast adjusted images were binarized, either by an Otsu method or a Laplace filter with alpha = 0.2 followed by selection of all negative values. A nuclear mask (generated as described below) was applied to the binarized immunofluorescence channel and, after small components (volume < 20 voxels) were removed, a watershed transformation was applied to the segmented clusters. Features, including mean fluorescence intensity of every immunofluorescence channel and overlap with clusters of other segmented immunofluorescence channels, were identified for each segmented cluster (see Supplementary table S2 for more details). For the first image stack, each step was visually inspected to confirm proper threshold levels.
The nuclear mask was generated by applying the same segmentation process from above to either a Hoechst stain channel or the same immunolabeled channel with a contrast adjustment done with 1-3 third quartiles clipping. The segmented channel was subject to morphological dilation with a sphere of 3 pixel radius, to fuse clusters within the nucleus. The convex hull was computed for the largest component (i.e. the nucleus) after all other clusters were removed. The segmented nucleus was morphologically eroded with the same sphere that was used to morphologically dilate the channel (see Supplementary table S2 for more details). For the first image stack, each step was visually inspected to confirm proper threshold levels.
After segmentation of the nuclear channel and immunofluorescence channels, the FISH raw channel was segmented in the same manner with the following exceptions: (i) the nuclear mask was applied after smoothing and before contrast adjustment; (ii) clipping during contrast adjustment was performed with a threshold of 10-15 third quartiles away from the median; (iii) no watershed transformation was applied to FISH segmented regions; (iv) clusters intersecting the periphery of the nuclear mask (e.g. highly fluorescent contaminant in FISH channel next to nucleus) were removed. Features, including mean fluorescence intensity of every immunofluorescence channel and overlap with each immunofluorescence segmented cluster, were identified for each segmented FISH cluster. Since the segmented FISH channel can contain small and dim clusters that, by visual inspection, do not correspond to the FISHlabeled genomic loci, small clusters (volume < 20-80 voxels) were filtered out before analyses. After segmentation of the FISH channel, randomly selected cubic regions were generated throughout the nuclear region of each image stack with a volume approximately equal to the mean volume of the selected FISH clusters. Mean fluorescence intensities of each immunofluorescence channel were determined for these random clusters (see Supplementary table S2 for more  details). For the first image stack, each step was visually inspected to confirm proper threshold levels.
Data obtained from the segmented clusters were inspected using contour and scatter plots with MATLAB built-in functions, or violin plots, using the MATLAB script violinplot (https://github.com/bastibe/Violinplot-Matlab). Contours were smoothed with a gaussian filter using 1 standard deviation within a 5 × 5 matrix. Pearson correlation coefficients were determined using the MATLAB function corrcoef.

Statistical analyses
Each figure, along with their related supplementary figures, represents an individual experiment where all cells were labeled, expanded, imaged and processed under the same conditions. Cell numbers for each experiment were: 1 (  Figures S17 and S18).
Fluorescence signal, defined as the mean fluorescence intensity for a given immunolabeled channel within a cluster of the same experiment, was used as the main measurement for comparing histone mark or paused RNA polymerase II levels within the segmented clusters of immunolabeled, FISH-labeled or randomly selected regions, or between the distribution of fluorescence signals for each set of clusters. Background fluorescence signal is defined as the mean fluorescence intensity for a given immunolabeled channel within the randomly-selected groups of background voxels for H3K4me3 marks in Figure 3, H3K27me3 marks in Figure 4C, or for both immunolabeled channels in Supplementary Figure S7 and S11. The groups of background voxels were equal in size to the median size of segmented clusters for each channel.
An arbitrary 'on' threshold (equal to the 5th percentile of the fluorescent signal found within a respective immunolabeled cluster set) is represented in all graphs excluding violin plots, as a qualitative determinant of high or low fluorescence signal within each set of clusters. Correlation coefficients were determined for each comparison between fluorescence signals within a set of clusters. A right-tailed Wilcoxon rank sum test was used to determine if for a given experiment the median fluorescence signal of a cluster set was significantly higher than the median signal in randomly selected regions or a separate set. All numbers corresponding to fraction of overlap and distance are represented as mean ± standard deviation.
To better understand the detection limitations for immunostaining a particular histone mark, cells were concurrently labeled with an antibody of interest and an al- ternative antibody, both targeting the same mark. The detection efficiency based on colocalization is defined as the percent of voxels from the clusters of the alternative antibody that intersect with the clusters of the antibody of interest. This colocalization is interpreted as a lower bound measurement of detection efficiency due to the challenges of targeting the same epitope with two competing antibodies. A fluorescence-based analysis was used instead to evaluate the detection efficiency of the antibody of interest. In this analysis, we determine the percent of clusters from the alternative antibody distribution that are above the background (i.e. >99.9% of the background fluorescence signal distribution) for the fluorescence signal of the antibody of interest.

SCEPTRE uses ExM to co-localize immunolabeled proteins at DNA FISH labeled genomic regions
The labeling of individual genomic loci by DNA fluorescence in situ hybridization (FISH) has provided a powerful tool for visualizing chromatin structure in single cells (30,(38)(39)(40). While DNA FISH could be combined with immunofluorescence labeling to enable concurrent visualization of chromatin modification states and associated proteins, integration of these two techniques has been challenging, because the harsh conditions required to melt doublestranded genomic DNA during labeling (e.g., treatment with hot formamide) may remove antibody labels applied before FISH or may compromise the antigenicity of relevant epitopes for post-FISH immunolabeling (Supplementary Figure S1) (20)(21)(22)(23). To overcome this challenge, we employed expansion microscopy (ExM) as a means to preserve the signal of immunolabeled protein structures during DNA FISH labeling. In ExM, immunolabeled structures are covalently linked to a swellable hydrogel polymer scaffold that is isotropically expanded in deionized water in order to reveal features closer than the ∼250 nm diffraction limit of light in the expanded state (25,26). ExM not only provides a high spatial resolution (∼75 nm or better when using a standard confocal microscope with ∼4× expanding gels), but also enables antibody labels to be covalently tethered to the hydrogel scaffold, such that DNA FISH can subsequently be performed without loss of antibody fluorescence. ExM has previously been combined with DNA FISH to either visualize the HER2 gene in tissue (41), or to visualize repetitive centromere regions in plants (42). However, this combination has not yet been used to determine the density of a protein structure, such as histone mark clusters, at specific genomic regions. We refer to this new methodology as Single Cell Evaluation of Post-TRanslational Epigenetic Encoding (SCEPTRE), as a tool to quantify the fluorescence signal of immunolabeled histone marks or proteins structures at individual FISH-labeled genomic loci within individual cells (Figure 1).
To test the ability of SCEPTRE to report on DNAprotein associations within the nucleus, we immunostained centromere-associated proteins while using DNA FISH to co-stain the repetitive alpha-satellite DNA of centromeres ( Figure 2A). ExM images revealed discrete regions corresponding to alpha-centromeres and centromere-associated proteins, as well as significant overlap between these two regions. To quantify this degree of overlap, we created an automated image analysis software routine in MATLAB to segment individual regions corresponding to centromeres and centromere-associated proteins, and then quantified their degree of co-localization (Supplementary Figure S2). From this analysis, we found that DNA-labeled regions had almost complete fractional overlap with centromereassociated proteins (0.97 ± 0.06). Furthermore, the distance between the nearest-neighbor centroids of the protein and DNA labeled regions was small relative to the average radius for either region (77 ± 85 nm versus 234 ± 68 nm (anti-cen.) and 224 ± 65 nm (␣-cen.), respectively). We then quantified the fluorescence signal of labeled centromere-associated proteins at individual centromeric DNA clusters, along with randomly selected regions of comparable size to these FISH clusters ( Figure 2B). While immunofluorescence and FISH-labeled regions maintained similar anti-centromere fluorescence signals, these regions showed much higher signals compared to randomly selected regions. Therefore, due to the high overlap between the FISH-labeled and immunolabeled regions, the high anticentromere signal in the FISH-labeled regions, and the fact that the anti-centromere labeled structures did not shift much between pre-and post-expansion (Supplementary Figure S3), we concluded that ExM can co-localize the signal of protein and DNA components of a genomic region within a nucleus.

SCEPTRE resolves multiple histone modifications at single gene loci in single cells
To determine whether SCEPTRE can distinguish between multiple histone marks at a single, non-repetitive genomic region, we concurrently visualized two histone marks, H3K4me3 and H3K27me3, at the house-keeping gene GAPDH (Figure 3). GAPDH encodes for glyceraldehyde-3-phosphate dehydrogenase, which is highly expressed in many cell types (43) due to its essential role in metabolism; (44) therefore, histone H3K4me3, commonly found at active gene promoters (45), is expected to be present at GAPDH, whereas histone H3K27me3, which is associated with repressed regions (46), is expected to be absent. Using SCEPTRE, we measured the fluorescence signals of immunolabeled H3K4me3 and H3K27me3 marks at the FISH-labeled GAPDH locus, along with the fraction of overlap between GAPDH and H3K4me3 or H3K27me3 clusters (Figure 3).
From this analysis, we observed that GAPDH had much higher H3K4me3 fluorescence signal compared to H3K27me3 signal ( Figure 3E and F). To our surprise, the H3K4me3 signal found at GAPDH varied greatly between loci, with some loci having high signals while others a more baseline level ( Figure 3G). These results were the same when only one of the two histone marks was labeled and imaged with GAPDH (Supplementary Figure S4), or when a different set of antibodies was used to label H3K4me3 and H3K27me3 in RPE1 cells (Supplementary Figure S5).
Interestingly, histone mark signals were uncorrelated between GAPDH alleles in the same cell ( Supplementary Figure S6A-B), suggesting histone mark levels at alleles from the same gene are independently regulated. When comparing these fluorescence signals to those obtained from randomly selected regions across the nucleus, GAPDH showed lower H3K27me3 signals and much higher H3K4me3 signals than those found at random regions. Similar to the fluorescence signal results, the mean fraction of overlap of GAPDH with H3K4me3 clusters was higher than with H3K27me3 clusters (0.21 ± 0.21 versus 0.045 ± 0.11, respectively). To corroborate these results, we measured the density of H3K4me3 and H3K27me3 marks across the RPE1 genome for an ensemble of cells using CUT&RUN followed by sequencing (28,29). Analysis of the CUT&RUN sequencing results revealed that a substantial presence of H3K4me3 marks was found at the targeted GAPDH region and only background levels of H3K27me3 marks were found for this same region ( Figure 3H), with the closest repressed region observed ∼500 kb away. These results demonstrate that SCEPTRE can distinguish between the abundance of two histone modifications at individual non-repetitive genomic regions within a nucleus.
To determine whether the heterogeneity of H3K4me3 marks at GAPDH results from detection limitations of the antibody used above (Rb × H3K4me3), we concurrently labeled H3K4me3 with Rb × H3K4me3 and an alternative antibody targeting the same mark but stained using a different fluorophore (Ms × H3K4me3, Supplementary Figure S7). As an initial estimate of detection efficiency, we calculated the percent of all voxels segmented for the alternative antibody that also segmented for the antibody of interest. This analysis yielded a nominal detection efficiency of 26% for Rb × H3K4me3; however, this value likely under-estimates the true detection efficiency, as the two antibodies likely compete for the same epitope (i.e. the methylated lysine moiety), reducing their ability to co-segment within the same voxel, particularly at high spatial resolution. Therefore, as an alternative measurement of detection efficiency, we directly quantified the cumulative fluorescence signal distribution for Rb × H3K4me3 within clusters segmented with the alternate antibody (Ms × H3K4me3), and we calculated the fraction of this signal above background. This analysis yielded a higher detection efficiency of 88% (Supplementary Figure S7). In contrast, the probability of detecting an antibody signal above background for GAPDH loci was considerably lower, at 67% (Supplementary Figure S7E); thus a substantial degree of heterogeneity in H3K4me3 levels at this locus likely reflects biological variability rather than limitations of antibody detection.
H3K4me3 and H3K27me3 are generally thought to mark distinct chromatin states, though they have been reported to colocalize to form 'bivalent domains' on genes primed for transcription. (47,48) We therefore investigated the relationship between H3K4me3 and H3K27me3 clusters across the nuclei of the RPE1 cells. As previously observed (18), H3K27me3 clusters preferentially inhabited the periphery of the nucleus, whereas H3K4me3 clusters were more evenly distributed ( Figure 3A). There was a low fraction of spatial overlap between H3K4me3 clusters and H3K27me3 clusters (0.079 ± 0.14 H3K4me3 with H3K27me3, 0.12 ± 0.16 H3K27me3 with H3K4me3). The H3K4me3 fluorescence signal in H3K27me3 clusters, as well as the H3K27me3 signal in H3K4me3 clusters, was substantially low, albeit higher than the distribution of random regions (Supplementary Figure S6C-D). We therefore plotted the frequency of H3K4me3 and H3K27me3 signals within each of the other's histone mark's clusters, along with these signals found in randomly selected regions ( Supplementary Figure S8). These plots show that H3K4me3 and H3K27me3 form largely non-overlapping clusters, though there exists a small fraction of clusters having high signal from both histone marks. These results suggest that H3K4me3 and H3K27me3 mostly form disjoint clusters, though a very small fraction may colocalize, similar to what has been observed in other differentiated cell types (49).

SCEPTRE quantifies histone modification levels at multiple genomic loci in single cells
To test whether SCEPTRE can quantify histone mark signals at multiple genomic loci within the same cell, we designed a library of FISH probes to simultaneously label three different genomic regions in RPE1 cells. The first region contains MYL6, a gene on Chr12 encoding myosin light chain-6 that is actively expressed in the eye; (50) the second contains the HOXC gene cluster, which is normally active in progenitors but repressed upon differentiation; (46,51)  non-coding P53 induced transcript (LINC-PINT) variant, a non-coding transcript that is found on Chr7, which is broadly expressed across multiple tissues (Figure 4) (52). Previous RNA-seq results in RPE1 cells (53) indicated that MYL6 is expressed at high levels, that all genes within the HOXC cluster are silent, and that LINC-PINT is transcribed at low levels (Supplementary Figure S3). As expected, bulk analysis of histone modifications at these loci using CUT&RUN revealed H3K4me3 peaks at MYL6 and LINC-PINT, but not within the HOXC cluster. H3K27me3 marks, on the other hand, covered a large region encom-passing the HOXC cluster, but were largely absent from MYL6 and LINC-PINT ( Figure 4E).
In agreement with population-level results, H3K4me3 fluorescence signals measured using SCEPTRE were significantly higher at MYL6 and LINC-PINT than at randomly chosen clusters ( Figure 4B, P < 10 −5 , MYL6; P <10 −6 , LINC-PINT), indicating enrichment of this histone modification at these two loci. Interestingly, both genes showed high H3K4me3 variability between individual loci, similar to what was observed at GAPDH loci. In contrast, H3K4me3 signals at the HOXC cluster were not significantly higher than those found at random regions. Consistently, H3K4me3 clusters showed greater overlap with the MYL6 and LINC-PINT loci than at the HOXC cluster, where it appeared to be visibly excluded (0.23 ± 0.26, MYL6 and 0.20 ± 0.24, LINC-PINT versus 0.068 ± 0.16, HOXC). Therefore, SCEPTRE detects differences in H3K4me3 levels between multiple genomic regions seen with population-level analysis, agreeing with the results obtained by CUT&RUN. We note that H3K4me3 mark signals were largely uncorrelated between two alleles of each gene in each cell (Supplementary Figure S9A), as well as between the loci of different genes in the same cell (Supplementary Figure S10A), indicating that the levels of this histone mark are largely independent across loci within individual cells.
Also in agreement with bulk analysis, The HOXC cluster showed higher H3K27me3 fluorescence signal compared to random clusters, and higher H3K27me3 signal compared to MYL6 or LINC-PINT ( Figure 4D). Strikingly, H3K27me3 levels varied substantially between different HOXC clusters, with a substantial fraction of HOXC clusters with either low or even baseline levels of H3K27me3, even though all genes in this cluster were silent in their expression (Supplementary Table S3). To determine whether this heterogeneity represents limits in H3K27me3 detection versus bona fide biological heterogeneity, we calculated H3K27me3 detection efficiency for this antibody (Rb × H3K27me3, Figure 4C) by co-staining with an alternative antibody, as above (Ms × H3K27me3, Supplementary Figure S11). This analysis yielded a H3K27me3 detection efficiency of 25%, as measured by co-segmentation of the two antibodies, and 72%, determined by quantifying the fraction of Rb × H3K27me3 signal above background for clusters segmented with the alternative antibody (Supplementary Figure S11). In contrast, only 11% of antibody signals at HOXC loci were detected above background (Supplementary Figure S11E), suggesting that the majority of H3K27me3 variability at HOXC reflects innate biological heterogeneity as opposed to technical limitations of antibody detection.
Although MYL6 did not have significantly higher H3K27me3 fluorescence levels compared to random regions, LINC-PINT did, despite an absence of H3K27me3 marking seen in CUT&RUN data. The presence of H3K27me3 at some LINC-PINT loci may reflect the looping of this locus to a different genomic region where H3K27me3 is present; to investigate this possibility, we consulted a previously published Hi-C data set for the RPE1 cell line; (54) indeed LINC-PINT maintained high frequency contacts with an adjacent H3K27me3 domain (Supplementary Figure S12), which may contribute to its lower transcriptional levels compared to MYL6 (Supplementary  table S3). These results suggest that, at the current spatial resolution, adjacent genomic regions can influence each other's histone mark levels detected by SCEPTRE. That being said, the SCEPTRE results broadly agree with the results obtained by CUT&RUN and can distinguish between the chromatin modification states of multiple genes in the same cell (e.g. MYL6 and HOXC). As seen with the H3K4me3 marks at these genomic regions, we observed no relationship between the H3K27me3 levels for two alleles of the same gene (Supplementary Figure S9B) or for alleles from different genes (Supplementary Figure S10B) in the same cell.

H3K4me3 modifications coincide with paused RNA polymerase II at a transcriptionally active locus
H3K4me3 levels have been reported to correlate with active transcription (27) and a model has been proposed where H3K4me3 facilitates the loading of RNA polymerase II, which remains paused proximally to the gene's promoter until a subsequent release step (55). However, this model was based on separate population-level measurements of H3K4me3 and RNA polymerase II, and did not distinguish whether both components coincide directly at the same time at single loci in cells. To test whether both H3K4me3 and paused RNA polymerase II were present simultaneously at GAPDH, we performed SCEPTRE with H3K4me3 and the e82 Nucleic Acids Research, 2021, Vol. 49, No. 14  We detected a large coincidence between H3K4me3 and paused RNA polymerase II, both at the GAPDH locus and also more broadly in the nucleus. At individual GAPDH loci, there were high signals from both H3K4me3 and paused RNA polymerase II ( Figure 5B-E), such that there was also a strong correlation between these signals ( Figure  5F, r = 0.70). Consistently, GAPDH overlapped with both H3K4me3 and paused RNA polymerase II clusters (0.23 ± 0.21 and 0.21 ± 0.19, respectively). Similarly to H3K4me3, paused RNA polymerase II signals were uncorrelated between GAPDH loci in the same cell (Supplementary Figure  S13). In the nucleus more broadly, there was also substantial colocalization between H3K4me3 clusters and paused RNA polymerase II clusters ( Figure 5B, fraction of overlap 0.19 ± 0.21), as well as a strong correlation between these two signals in randomly selected region clusters (Supplementary Figure S14C, r = 0.68).
In contrast, no correlation was seen at GAPDH between H3K4me3 and H3K27me3 signals ( Figure 3G, r = 0.02), or between H3K27me3 and paused RNA polymerase II (Supplementary Figure S15F, r = 0.04). On a broader level, there was also little to no correlation in random regions between H3K4me3 and H3K27me3 (Supplementary Figure  S8C, r = 0.18), or between H3K27me3 and paused RNA polymerase II (Supplementary Figure S16C, r = 0.17). Interestingly, when H3K27ac, another active histone mark (59), was concurrently visualized with paused RNA polymerase II, some correlation was seen between these two signals, with r = 0.43 at GAPDH (Supplementary Figure   S17F), and r = 0.59 at random regions (Supplementary Figure S18C). However, the fraction of GAPDH loci with high H3K27ac signals was smaller compared to that with high paused RNA polymerase II signals, suggesting that H3K27ac and the phosphorylation indicative of paused RNA polymerase II play distinct roles in the transcriptional cycle. Together, these results are consistent with a close regulatory relationship between H3K4me3 modifications and the loading of paused RNA polymerase II, both at GAPDH and more broadly across the nucleus.

DISCUSSION
SCEPTRE is a new method capable of profiling chromatin states at multiple genomic loci within the 3D nuclear context of a cell by combining immunofluorescence with DNA in situ labeling by means of ExM. This combination enables efficient detection of histone mark fluorescence signals at a resolution of ∼75 nm, sufficient to quantify histone mark abundance at individual genomic loci. In contrast to sequencing-based methods, SCEPTRE provides quantitative measurements of physical properties, such as overlap, density, and position within the nucleus for more than one histone mark at multiple genomic regions. Such measurements reveal a heterogeneity in chromatin states that has been previously masked in ensemble sequencing-based methods.
There are limitations to SCEPTRE compared to other histone mark profiling methods. Sequencing based methods can achieve nucleosome level resolution for histone mark mapping across an entire genome, such as in the case of CUT&RUN (28). Since SCEPTRE relies on DNA FISH, detection of genes by in situ labeling often is limited to a PAGE 11 OF 15 Nucleic Acids Research, 2021, Vol. 49, No. 14 e82 A minimum labeling size of over 10 kb, since smaller regions are detected with lower efficiency. However, genome organization is thought to occur at a larger scale than that of the single nucleosome. Nucleosomes are known to organize as clusters throughout the nucleus, with spatial sizes ranging around 50-100 nm (17), a size that corresponds to ∼10 kb of genomic DNA, depending on the region's activity state (19). The scale increases further when observing Topologically Associating Domains (TADs), which are genomic regions of ∼200 kb to 1 Mb in size that maintain similar epigenetic and regulatory landscapes (60,61), or the smaller sub-TADs that are ∼185 kb (62), with spatial sizes of ∼160 nm (63). Even larger than 1 Mb are chromatin A and B compartments which are associated with broader open (active) and closed (repressed) states (64), with spatial sizes on the m scale (65,66). Since SCEPTRE has allowed us to profile multiple genes at the lower scale of this genomic organization, there is potential to build upon this technique in order to target larger genomic regions by using multiplex FISH methods, such as MERFISH (67), seq-FISH (68), ORCA (69) or OligoFISSEQ (70). These methods would allow SCEPTRE to interrogate the relationships between histone modifications and gene activity at a variety of developmentally-regulated genes, and at increasingly larger scales of genome organization. SCEPTRE revealed heterogeneity in the levels of H3K4me3 at active genomic regions such as GAPDH, MYL6 and LINC-PINT beyond what would be expected from technical limitations of SCEPTRE. This variability, which has been suggested in sequencing-based singlecell chromatin profiling studies (9,71), suggests that active gene loci can adopt different states with different levels of H3K4me3 modification. Moreover, because SCEPTRE revealed a close correlation between H3K4me3 and paused RNA polymerase II levels at individual loci, this heterogeneity may reflect differences in the transcriptional state e82 Nucleic Acids Research, 2021, Vol. 49, No. 14 PAGE 12 OF 15 of each gene. Given that genes are transcribed in bursts (72), where polymerase recruitment happens intermittently (73), it is plausible that H3K4me3 marks and polymerase II phosphorylation at Serine 5 are concurrently added during a transcriptional burst, but removed at a later stage in the transcriptional cycle. Moving forward, it would be useful to utilize SCEPTRE to further visualize H3K4me3 and other histone marks alongside different stages of transcription, to elucidate how histone marks participate in the regulation of gene transcription.
Similar to H3K4me3, H3K27me3 also appeared to show substantial heterogeneity in its levels at individual HOXC loci, with a considerable fraction of loci having low or baseline levels of this modification. As the HOXC maintains a transcriptionally silent state in these cells (Supplementary table S3), our results suggest that the HOXC cluster is able to maintain a silent chromatin state even with low or baseline levels of H3K27me3 modification. In line with these findings, our recent study has pointed to a necessity for H3K27me3-independent mechanisms in the maintenance of compacted, polymerase-inaccessible state at repressed developmental gene loci (74). Gene repression at the Hox gene cluster requires PRC1, a protein complex that mediates chromatin compaction and gene silencing (75). PRC1 binds to H3K27me3, an interaction that explains the co-localization of these two factors in the genome; however, PRC1 can also bind genomic loci independently of H3K27me3, and thus could conceivably maintain a repressed state at the HOXC locus in the absence of H3K27me3 (76). To further investigate these ideas, it will be helpful to use SCEPTRE to interrogate polycomb domains at other genomic loci.
Lastly, there are certain factors that influence the way SCEPTRE profiles the epigenetic state of genes. As demonstrated in the example of the LINC-PINT region ( Figure  4 and Supplementary Figure S9), the 3D context of a region can influence its epigenetic profile. This 'crosstalk' from neighboring regions is most-likely due to the fact that genes with different epigenetic states may be found closer than the resolution of ExM provides at a 4× expansion factor (∼75 nm). If so, methods that achieve better resolution, such as iterative ExM (77) or the combination of ExM with structured illumination microscopy (78), would provide a greater distinction between epigenetic states of neighboring genes while using SCEPTRE. Another factor that may play a role in profiling are the genetic elements within a targeted region. The labeled LINC-PINT region was an internal sequence of a gene, which showed a different distribution of H3K4me3 signals compared to the MYL6 region, whose promoter was found at the center of the labeled region. Therefore, considering the 3Dcontext of chromatin within cells (seen by Hi-C from a previous study) (54) and the epigenetic landscape of a genomic sequence (seen by CUT&RUN in this study) can help with either selecting each targeted region for SCEP-TRE, or in determining the epigenetic state for each region within a cell. Further improvements in multiplexing and resolution would allow SCEPTRE to systematically profile chromatin states in the genome, providing new insights into our understanding of genome structure and function.

DATA AVAILABILITY
CUT&RUN sequencing data for H3K4me3, H3K27me3 and H3K27ac were submitted to the NCBI gene expression omnibus (http://www.ncbi.nlm.nih.gov/geo/) and are available under the accession number GSE160784. These results, which were obtained using ENCODEvalidated antibodies, can be viewed using the following UCSC genome browser session: https://genome.ucsc. edu/s/marcwood/SCEPTRE RPE1 CnR hg38. For FISHlabeled regions, CUT&RUN results were confirmed by comparison to previously published ChIP-seq results (79,80).
MATLAB scripts for image processing and analysis are available at GitHub (https://github.com/marcwood13/ SCEPTRE pipeline). Additional data related to this paper will be made available by the corresponding authors upon reasonable request.

NOTE ADDED IN PROOF
A recently published paper combined immunofluorescence and DNA FISH to visualize multiple histone modifications across many genomic regions using sequential hybridization (81), though imaging was performed using conventional light microscopy that limited their analysis to chromosomal scale domains. This sequential hybridization approach developed can potentially be combined with SCEPTRE to allow for finer-scale profiling of chromatin states across a large number of gene loci.