Turning single cells into microarrays by super-resolution barcoding

In this review, we discuss a strategy to bring genomics and proteomics into single cells by super-resolution microscopy. The basis for this new approach is the following: given the 10 nm resolution of a super-resolution microscope and a typical cell with a size of (10 µm)^3, an individual cell effectively contains 10^9 super-resolution pixels, or bits of information. Most eukaryotic cells have 10^4 genes and cellular abundances of 10 to 100 copies per transcript, or at most about 10^6 transcripts in total. Thus, under a super-resolution microscope, an individual cell has 1000 times more pixel volume, or information capacity, than is needed to encode all transcripts within that cell. Individual species of mRNA can be uniquely identified by labeling each with a distinct combination of fluorophores by fluorescence in situ hybridization. With at least 15 fluorophores available in super-resolution, hundreds of genes can be barcoded with a three-color barcode (C(15, 3) = 455). These calculations suggest that by combining super-resolution microscopy and barcode labeling, single cells can be turned into informatics platforms denser than microarrays, and that molecular species in individual cells can be profiled in a massively parallel fashion.


INTRODUCTION
There have been two transformative technologies in modern systems biology: genomics, which allows all genes and proteins in an organism to be monitored simultaneously, and single-cell biology, which follows a few specific genes in individual cells with high precision in their native microenvironments. Both techniques are powerful. Genomics technologies employing microarrays [1] and next-generation sequencing [2][3][4][5] can probe mRNA abundances and DNA-protein interactions across the entire genome. At the same time, single-cell approaches with fluorescent proteins [6][7][8][9][10][11] and fluorescence in situ hybridization (FISH) [12,13] are highly quantitative and can detect mRNAs and proteins in cells with single-molecule accuracy [14][15][16]. However, the two approaches have complementary limitations: genomics averages over the heterogeneity and spatial complexity of a cell population, while single-cell techniques can only probe a few genes at a time. Integrating genomics with single-cell biology is the next major challenge in biology.
There have been significant efforts to scale high-throughput techniques down to the single-cell level. The main challenge is that single cells contain only a small amount of material that can be analyzed. For example, the nucleic acid contents of single cells need to be amplified in order to be sequenced, but amplification may introduce biases and distort the quantitation of molecular species in single cells. Digital PCR [17,18] partially resolves this problem by spatially separating single molecules of cDNA, converted from mRNA molecules, into distinct wells and using the number of wells that light up to read out the copy number of mRNAs in the sample. Generalizations of this idea have recently been implemented [19][20][21][22][23] to improve the quantitation of DNA-seq and RNA-seq by ligating random barcodes to the cDNAs prior to amplification, as a way of digitizing the quantification of sequencing reads. This method may allow more quantitative RNA-seq from single cells. However, single cells still need to be isolated and extracted from tissues, which removes information about the intracellular and intercellular locations of the RNAs.

MOTIVATION
Spatial separation underlies many biochemical and analytical techniques. Gel electrophoresis and affinity columns are routinely used to separate molecules based on their physical properties as well as their binding affinities. Microarrays generalize this principle in a high-throughput fashion, compared to northern blots, by spotting different oligonucleotides complementary to different genes on a dense spatial array. Spatial separation can also trade data space for improved accuracy of quantitation, as discussed above for digital PCR and sequencing.
Resolving molecules natively in individual cells without separation has become possible with the advent of super-resolution microscopy techniques such as PALM [24], STORM [25], FPALM [26], SSIM [27] and STED [28], as many cellular components can be resolved down to nanometer accuracy. This boon in resolution has made a significant impact in cell biology. We propose that super-resolution microscopy also holds high potential for single-cell systems biology: many molecular species can be inherently spatially separated within individual cells. With a typical cell of (10 µm)^3, a 3D-STORM microscope with a lateral resolution of 15 nm and an axial resolution of 50 nm can in principle resolve 10^8 such pixels in a cell. In comparison, there are only on the order of 10^6 mRNA molecules per cell [3,4]. Thus, many messenger RNAs can be spatially resolved, and an individual cell can, in essence, serve as a microarray under a super-resolution microscope (Figure 1).
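The pixel estimate above can be checked with a short calculation. The following is a minimal sketch in Python that treats the cell as a cube and divides each edge by the corresponding resolution; the function and variable names are ours for illustration, and the mRNA count is the order-of-magnitude figure quoted in the text.

```python
def resolvable_voxels(cell_edge_nm, lateral_nm, axial_nm):
    """Approximate number of resolution-limited voxels in a cubic cell.

    Two axes are sampled at the lateral resolution and one at the
    axial resolution. This is a back-of-the-envelope estimate only.
    """
    per_lateral_axis = cell_edge_nm / lateral_nm
    per_axial_axis = cell_edge_nm / axial_nm
    return per_lateral_axis ** 2 * per_axial_axis


CELL_EDGE_NM = 10_000      # 10 um cell edge, as in the text
LATERAL_NM = 15            # 3D-STORM lateral resolution from the text
AXIAL_NM = 50              # 3D-STORM axial resolution from the text
MRNA_PER_CELL = 1e6        # order-of-magnitude estimate from the text

voxels = resolvable_voxels(CELL_EDGE_NM, LATERAL_NM, AXIAL_NM)
voxels_per_mrna = voxels / MRNA_PER_CELL

print(f"resolvable voxels: {voxels:.2e}")
print(f"voxels per mRNA molecule: {voxels_per_mrna:.0f}")
```

Under these assumptions the cell holds just under 10^8 voxels, that is, on the order of a hundred voxels for every mRNA molecule.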
While super-resolution microscopy provides the optical space to resolve a large number of molecules in cells, each molecular species still needs to be specifically labeled and uniquely identified. Pioneering work in single-molecule FISH (smFISH) by Singer [12] and Raj [13] using short synthetic oligonucleotides has shown that transcription active sites and single mRNAs in cells can be detected with high specificity and accuracy. This smFISH technology has been used to multiplex chromosomal loci and transcription active sites by barcoding with a combination of fluorophores [29][30][31]. We can borrow this approach to label single mRNAs. In the STORM version of super-resolution microscopy, fluorophores are constructed from pairs of organic dyes in an activator and emitter configuration, giving rise to at least nine distinct colors [32]. With this large palette, it can be straightforward to scale up the multiplexing capacity. An alternative to the spectral barcoding used for chromosome labeling involves resolving the spatial order of the barcode on the mRNA in super-resolution. Both spectral and spatial schemes have been demonstrated [33]. The trade-off between the two schemes is that spatial barcoding scales up more efficiently, while spectral barcoding can be read out more robustly [33].
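One way to see why a spatial scheme scales more efficiently is to count codes. The sketch below is an illustration under our own simplifying assumptions, not a description of the published scheme: a spectral barcode is taken to be an unordered set of distinct colors, a spatial barcode an ordered sequence of distinct colors, and any ambiguity in the direction in which a spatial barcode is read is ignored.

```python
from math import comb, perm

N_COLORS = 9       # "at least nine distinct colors" quoted in the text
BARCODE_LENGTH = 3  # three positions per barcode (assumed for illustration)

# Spectral barcode: only the set of colors present matters.
spectral_codes = comb(N_COLORS, BARCODE_LENGTH)

# Spatial barcode: the order of the colors along the mRNA also matters
# (assumes distinct colors and an unambiguous read direction).
spatial_codes = perm(N_COLORS, BARCODE_LENGTH)

print(f"spectral (unordered) codes: {spectral_codes}")
print(f"spatial (ordered) codes:    {spatial_codes}")
```

With the same palette and barcode length, the ordered scheme yields 3! = 6 times as many codes under these assumptions, at the cost of having to resolve the order of the probes along each transcript.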

METHOD
Our workflow for multiplex mRNA detection in single cells is as follows. First, we design probes that hybridize specifically against each gene. The probes are designed as pairs, one labeled at the 5′-end and the other at the 3′-end, such that when a probe pair hybridizes to the mRNA, it brings two dye molecules, an activator and an emitter, into close proximity to form a functional dye pair for STORM. Second, we barcode-label the probe sets so that each gene in the multiplex is assigned a unique barcode imparted by its probes. Third, cells are hybridized with this probe set and imaged on a super-resolution microscope (Figure 2). Since the STORM imaging routine involves switching off all of the emitters initially and reactivating them by exciting their neighboring activators, the labeling scheme with pairs of probes provides additional labeling specificity and background rejection. Probe pairs hybridized to target mRNAs can be reactivated, while probes nonspecifically bound to the cell are unlikely to form dye pairs that can be reactivated. Finally, after imaging, the number of times each barcode is detected in an individual cell is counted and used to quantify the abundance of the corresponding mRNA.
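The final counting step can be sketched as a lookup from detected color combinations to genes. The following Python sketch assumes that image analysis has already reduced each resolved spot to the set of colors found in it; the codebook, color labels and gene names are hypothetical placeholders, and spots whose colors match no assigned barcode are tallied separately rather than discarded.

```python
from collections import Counter


def count_transcripts(spots, codebook):
    """Tally decoded barcodes for one cell.

    spots    -- iterable of color collections, one per resolved spot
    codebook -- dict mapping frozenset of colors to a gene name
    """
    counts = Counter()
    for colors in spots:
        # A spectral barcode is order-independent, so compare as sets.
        gene = codebook.get(frozenset(colors))
        counts[gene if gene is not None else "unassigned"] += 1
    return counts


# Hypothetical two-gene codebook and four detected spots in one cell.
codebook = {
    frozenset({"A", "B", "C"}): "gene_1",
    frozenset({"A", "B", "D"}): "gene_2",
}
spots = [("A", "B", "C"), ("C", "B", "A"), ("A", "D", "B"), ("A", "C", "D")]

counts = count_transcripts(spots, codebook)
print(dict(counts))
```

Keeping the unassigned tally makes it possible to monitor how often spots fail to decode, which is one simple indicator of labeling or imaging problems.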
We have recently demonstrated a 32-gene multiplex in single yeast cells using the spectral super-resolution barcoding technique. To accomplish this multiplex, we used three-color barcodes with seven fluorophores (C(7, 3) = 35). Three of the barcodes were intentionally left empty to measure the false-positive rate. These empty barcodes were detected at 0.67 ± 0.84 copies per cell, indicating that if more than one copy of a particular mRNA is detected in a cell, it can be confidently measured above the background noise. We performed quantitative comparisons against qPCR and smFISH and showed that super-resolution barcode FISH matches both with a correlation coefficient of 0.95 in each case. In addition, we scrambled the barcode assignments to show that the quantitation does not depend on which barcode is assigned to which gene [33]. We used this multiplex to show that gene expression is heterogeneous at the regulon level in individual cells.
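The bookkeeping behind this design, with 35 possible barcodes, 32 assigned to genes and 3 held back as empty controls, can be illustrated as follows. This is a minimal sketch, not the pipeline used in [33]: the dye and gene names are placeholders, the order in which barcodes are assigned is arbitrary, and the per-cell counts for the empty barcodes are made-up numbers included only to show how a background mean and standard deviation would be computed.

```python
from itertools import combinations
from statistics import mean, stdev


def build_codebook(fluorophores, genes, n_colors=3):
    """Assign one unordered n-color barcode per gene; return leftovers."""
    barcodes = [frozenset(c) for c in combinations(fluorophores, n_colors)]
    if len(genes) > len(barcodes):
        raise ValueError("more genes than available barcodes")
    assigned = dict(zip(barcodes, genes))
    empty = barcodes[len(genes):]  # unassigned barcodes, used as controls
    return assigned, empty


fluorophores = [f"dye_{i}" for i in range(1, 8)]   # 7 placeholder dyes
genes = [f"gene_{i:02d}" for i in range(1, 33)]    # 32 placeholder genes
assigned, empty = build_codebook(fluorophores, genes)

# Made-up detections of empty barcodes in ten cells (illustration only).
empty_counts_per_cell = [0, 1, 0, 2, 0, 1, 0, 0, 3, 0]
background_mean = mean(empty_counts_per_cell)
background_sd = stdev(empty_counts_per_cell)

print(len(assigned), "assigned barcodes;", len(empty), "empty controls")
print(f"background: {background_mean:.2f} +/- {background_sd:.2f} per cell")
```

Any barcode detected in a cell at a level well above this background can then be treated as a genuine signal.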

OUTLOOK
To further scale up this approach, more fluorophores can be added to the base palette for barcoding. Recent work [34] has shown that there are most likely 15 dye pairs available commercially. A two-color barcode scheme can code for C(15, 2) = 105 genes, and a three-color barcode scheme allows a C(15, 3) = 455 gene multiplex. This level of throughput makes the technique a powerful follow-up to RNA-seq experiments, with a select group of target genes imaged in cells in their native microenvironments. In addition, considering that most RNAi screens are performed with 100-500 genes, this level of throughput will allow de novo discovery of genes with the super-resolution barcoding approach by identifying the transcripts whose expression correlates most closely with the phenotype.
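The same counting argument can be turned around to ask how large a palette a given experiment needs. The sketch below, with helper names of our own choosing, reproduces the two capacities quoted above and then finds the smallest number of fluorophores whose three-color combinations cover a target number of genes; it assumes every combination is usable as a barcode and reserves none as empty controls.

```python
from math import comb


def barcode_capacity(n_fluorophores, n_colors):
    """Number of unordered barcodes using n_colors distinct fluorophores."""
    return comb(n_fluorophores, n_colors)


def smallest_palette(n_genes, n_colors):
    """Smallest palette whose n_colors-combinations cover n_genes."""
    n = n_colors
    while comb(n, n_colors) < n_genes:
        n += 1
    return n


two_color = barcode_capacity(15, 2)
three_color = barcode_capacity(15, 3)
palette_for_100 = smallest_palette(100, 3)
palette_for_500 = smallest_palette(500, 3)

print(f"15 dyes, two-color barcodes:   {two_color} genes")
print(f"15 dyes, three-color barcodes: {three_color} genes")
print(f"dyes needed for 100 genes: {palette_for_100}")
print(f"dyes needed for 500 genes: {palette_for_500}")
```

Under these assumptions, a 15-dye palette covers most of the 100-500 gene range with three-color barcodes, and one additional dye would cover all of it.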
The ability to visualize mRNAs in situ is powerful for investigating cell-cell interactions in a range of systems, from microbial communities to embryos. However, current STORM imaging is confined to optically thin samples. To bring this technology into optically thick samples, light sheet microscopy, or Selective Plane Illumination Microscopy (SPIM) [35], is needed to illuminate the sample efficiently and avoid photobleaching. Recent work has shown that combining super-resolution microscopy with SPIM is possible [36]. Further work in applying SPIM to super-resolution barcoding will allow this technique to reach its full potential for single-cell systems biology in complex samples involving heterogeneous populations of cells.
Super-resolution barcoding can be used beyond multiplex quantitation of mRNA abundance. In principle, any molecule that can be labeled with a high-affinity multi-fluorophore tag can be multiplexed, as long as the barcode density does not exceed the optical resolution. The approach can be employed to detect transcript diversity in alternative splicing variants as well as to image chromosomal conformation with multiplex labeling of DNA. Recent work has shown the detection of two isoforms of an alternatively spliced mRNA [37,38]. Super-resolution barcoding can be used to detect more complex splicing patterns, possibly in application to the self-avoidance of neurons in flies with DSCAM [39] and protocadherins in vertebrates [40]. In addition, we have recently demonstrated efficient DNA FISH for labeling chromosomes with short DNA oligo probes. Multiplexing of DNA FISH will allow us to visualize the native chromosomal structure directly in single cells. Simultaneous detection of transcription factors or chromatin-modifying proteins by super-resolution imaging, with the multiplex DNA FISH barcodes serving as landmarks, will allow researchers to perform single-cell versions of chromatin immunoprecipitation experiments. Finally, proteins and other molecular species may be multiplexed in single cells using antibodies or synthetic aptamers [41,42] with super-resolution barcoding. The vast imaging space available in a single cell under a super-resolution microscope raises the possibility that single-cell versions of many types of microarray experiments can be implemented with super-resolution barcoding.

CONCLUSION
This technique has the potential to profile mRNA transcripts with single-molecule sensitivity, generate a high-resolution physical map of chromosomes and map protein-DNA interactions, all within individual cells. The preliminary data on resolving barcodes on mRNAs labeled by FISH with super-resolution microscopy demonstrate the feasibility of the super-resolution barcoding approach. This approach offers several distinct advantages: the detection method is high-throughput and single-molecule sensitive, it requires only single cells as starting material, and the in situ labeling preserves the spatial context of molecules in cells as well as cellular contacts in tissues. By mapping regulatory networks within cells and signaling interactions among cells in tissues, this approach opens up a wide range of biological problems in development and neurobiology in which individual cells embedded in tissues interact to generate patterns and determine cell fates. The single-cell techniques discussed here can have a broad impact, from fundamental scientific problems in regulatory networks to medical applications in mapping out the molecular origins of diseases such as cancer and autism, and hold promise as next-generation diagnostic tools in clinical settings.

Key points
Super-resolution microscopy allows individual cells to be resolved into millions to billions of pixels, turning individual cells into virtual microarrays. Many molecular species can then be barcode-labeled and detected simultaneously in single cells with super-resolution imaging. We demonstrated this approach using FISH barcode labeling of mRNAs in a 32-gene multiplex in single yeast cells.