Abstract

STUDY HYPOTHESIS

Does primordial germ cell (PGC) activation start before mouse embryo implantation, and does the possible regulation of the DNA (cytosine-5-)-methyltransferase 3-like (Dnmt3l) by transcription factor AP-2, gamma (TCFAP2C) have a role in this activation and in the primitive endoderm (PE)–epiblast (EPI) lineage specification?

STUDY FINDING

A burst of expression of PGC markers, such as Dppa3/Stella, Ifitm2/Fragilis, Fkbp6 and Prdm4, is observed from embryonic day (E) 3.25, and some of them, together with the late germ cell markers Zp3, Mcf2 and Morc1, become restricted to the EPI subpopulation at E4.5, while the dynamics analysis of the PE–EPI transitions in the single-cell data suggests that TCFAP2C transitorily represses Dnmt3l in EPI cells at E3.5 and such repression is withdrawn with reactivation of Dnmt3l expression in PE and EPI cells at E4.5.

WHAT IS KNOWN ALREADY

In the mouse preimplantation embryo, cells with the same phenotype take different fates based on the orchestration between topological clues (cell polarity, positional history and division orientation) and gene regulatory rules (at transcriptomics and epigenomics level), prompting the proposal of positional, stochastic and combined models explaining the specification mechanism. PGC specification starts at E6.0–6.5 post-implantation. In view of the important role of DNA methylation in developmental events, the cross-talk between some transcription factors and DNA methyltransferases is of particular relevance. TCFAP2C has a CpG DNA methylation motif that is not methylated in pluripotent cells and that could potentially bind on DNMT3L, the stimulatory DNA methyltransferase co-factor that assists in the process of de novo DNA methylation. Chromatin-immunoprecipitation analysis has demonstrated that Dnmt3l is indeed a target of TCFAP2C.

 STUDY DESIGN, SAMPLES/MATERIALS, METHODS

We aimed to assess the timing of early preimplantation events and to understand better the segregation of the inner cell mass (ICM) into PE and EPI. We designed a single-cell transcriptomics dynamics computational study to identify markers of the PE–EPI bifurcation in ICM cells through searching for statistically significant (using the Student's t-test method) differently expressed genes (DEGs) between PE and EPI cells from E3.5 to E4.5. The DEGs common for E3.5 and E4.5 were used as the markers defining the steady states. We collected microarray and next-generation sequencing transcriptomics data from public databases from bulk populations and single cells from mice at E3.25, E3.5 and E4.5. The results are based on three independent single-cell transcriptomics data sets, with a fold change of 3 and P-value <0.01 for the DEG selection.

MAIN RESULTS AND THE ROLE OF CHANCE

The dynamics analysis revealed new transitory E3.5 and steady PE and EPI markers. Among the transitory E3.5 PE markers (Dnmt3l, Dusp4, Cpne8, Akap13, Dcaf12l1, Aaed1, B4galt6, BC100530, Rnpc3, Tfpi, Lgalsl, Ckap4 and Fbxl20), several (Dusp4, Akap13, Cpn8, Dcaf12l1 and Tfpi) are related to the extracellular regulated kinase pathway. We also identified new transitory E3.5 EPI markers (Sgk1, Mal, Ubxn2a, Atg16l2, Gm13102, Tcfap2c, Hexb, Slc1a1, Svip, Liph and Mier3), six new stable PE markers (Sdc4, Cpn1, Dkk1, Havcr1, F2r/Par1 and Slc7a6os) as well as three new stable EPI markers (Zp3, Mcf2 and Hexb), which are known to be late stage germ cell markers. We found that mouse PGC marker activation starts at least at E3.25 preimplantation. The transcriptomics dynamics analyses support the regulation of Dnmt3l expression by TCFAP2C.

 LIMITATIONS, REASONS FOR CAUTION

Since the regulation of Dnmt3l by TCFAP2C is based on computational prediction of DNA methylation motifs, Chip-Seq and transcriptomics data, functional studies are required to validate this result.

 WIDER IMPLICATIONS OF THE FINDINGS

We identified a collection of previously undescribed E3.5-specific PE and EPI markers, and new steady PE and EPI markers. Identification of these genes, many of which encode cell membrane proteins, will facilitate the isolation and characterization of early PE and EPI populations. Since it is so well established in the literature that mouse PGC specification is a post-implantation event, it was surprising for us to see activation of PGC markers as early as E3.25 preimplantation, and identify the newly found steady EPI markers as late germ cell markers. The discovery of such early activation of PGC markers has important implications in the derivation of germ cells from pluripotent cells (embryonic stem cells or induced pluripotent stem cells), since the initial stages of such derivation resemble early development. The early activation of PGC markers points out the difficulty of separating PGC cells from pluripotent populations. Collectively, our results suggest that the combining of the precision of single-cell omics data with dynamic analysis of time-series data can establish the timing of some developmental stages as earlier than previously thought.

 LARGE-SCALE DATA

Not applicable.

STUDY FUNDING AND COMPETING INTEREST(S)

This work was supported by grants DFG15/14 and DFG15/020 from Diputación Foral de Gipuzkoa (Spain), and grant II14/00016 from I + D + I National Plan 2013–2016 (Spain) and FEDER funds. The authors declare no conflict of interest.

Introduction

Single-cell genomics have emerged as powerful tools to study the biology of single cells on a genome-wide scale connecting genotype to phenotype events at cellular (Tang et al., 2011; Review; Shapiro et al., 2013; Review; Saliba et al., 2014; Review; Nakamura et al., 2015) and subcellular level (Grindberg et al., 2013). One of the important applications of these tools is to obtain a detailed picture of the early stages of embryonic development based on transcriptional characterization of single cells. ‘Bulk’ studies of embryogenesis have provided invaluable insight into transcription-factor networks of development and differentiation (Maekawa et al., 2007; Giritharan et al., 2010; Choi et al., 2012). However, when we analyze a pool of cells, as opposed to single cells, we obtain average values of expression and lose information about variability. Single-cell transcriptomics analysis allows us to obtain additional information when studying the variability in a cell population, such as which genes have uniform expression in the whole population, which have high variability across the population and how this variability is changing dynamically across time, disclosing different bifurcations and stable points.

Mouse development is a highly regulated process, in which the totipotency of each blastomere is gradually lost as embryonic and extraembryonic lineages become specified (Schrode et al., 2013, Review). The three distinct cell lineages in the mature blastocyst are the extraembryonic trophectoderm (TE), the primitive endoderm (PE), and the embryonic epiblast (EPI). The specification of these three cell types is achieved through two ‘cell fate decisions’ (Bedzhov et al., 2014, Review). The first cell fate decision involves differentiation of the TE that gives rise to the placenta, and formation of the pluripotent inner cell mass (ICM), which during the second cell fate decision in its own turn is segregated into the PE, that gives rise to the yolk sac, and the pluripotent EPI, which generates the cells of the future body. Extensive expression heterogeneities among cells precede the emergence of the PE and EPI lineages. At the morula stage Embryonic day (E)2.5, Nanog and Gata6 are expressed by all ICM cells (Chazaud et al., 2006), whereas at E3.5, the ICM cells express PE and EPI markers, (Gata6, Sox17, Sox7, Gata4) and (Nanog, Sox2) (Schrode et al., 2013, Review), respectively, in a mutually exclusive, ‘salt-and-pepper’ way, thus identifying the precursors of each of the two lineages (Chazaud et al., 2006). Subsequent sorting of the two progenitors leads to the formation of morphologically distinct PE and EPI layers at E4.5.

Different models tried to explain the mechanism of specification of the EPI and PE progenitors. Morris et al. (2010) proposed that the timing of cell internalization, cell position and cell sorting combine to determine the distinct lineages of the preimplantation embryo, whereas according to Yamanaka et al. (2010), stochastic and progressive specification of EPI and PE lineages occurs during maturation of the blastocyst. In support of this stochasticity, single ICM cell transcriptomics studies have shown that there is intrinsic variation in the levels of expression of Fgf4 and Fgfr2 among the ICM cells (Kurimoto et al., 2006). However, the exact timing of specification and the mechanism of the formation of the three distinct cell lineages in the mature blastocyst still have to be explained.

Through the development of a computational tool for the discovery of DNA methylation motifs and its application to the analysis of pluripotent cells, we discovered that the transcription factor AP-2, gamma, TCFAP2C/TFAP2C, has a CpG DNA methylation motif that is not methylated in pluripotent cells and that could potentially bind on FGF4 (Luu et al., 2013) and on DNMT3L,the stimulatory DNA methyltransferase co-factor that assists the process of de novo DNA methylation (Gowher et al., 2005). TCFAP2C co-operates with CDX2 to override the pluripotency program and establish the extraembryonic trophoblast maintenance program in murine embryos (Kuckenberg et al., 2010). A subsequent chromatin-immunoprecipitation analysis demonstrated that Dnmt3l—together with Fgf4, Dnmt3b, Sfrp1, Dmrt1, Nanos3, c-Kit, Cdk6, Cdkn1a, Klf4—is indeed a target of TCFAP2C (Schemmer et al., 2013). Here, using transcription dynamics analysis, we aim to study whether such a target role has also a transcriptional readout, and whether and when a possible interplay between Dnmt3l and Tcfap2c has a role in lineage specification in the mouse preimplantation embryo. We extended the dynamics analysis to find out an appropriate timing for the lineage specification and possible mechanisms behind such specification, and to determine the transcripts associated with the PE and EPI transitions at E3.5. Additionally, we performed static analysis to obtain PE and EPI markers from the single-cell data. Although events leading to primordial germ cell (PGC) activation are not discussed in the literature before implantation, our dynamics transcriptomics analysis allowed us to discover late germ cell markers and PGC markers among the genes specific for EPI cells, and prompted the study of the transcriptomics dynamics of the PGC markers before implantation.

Materials and Methods

We collected data from the Affymetrix Mouse Genome 430 2.0 Array platform from bulk and single-cell analyses, and single-cell data from Illumina HiSeq 2000 RNA-Seq from the GEO and ArrayExpress databases. The annotation records of the collected data are presented in Table I. The Affymetrix transcriptomics data were normalized with the Robust Multi-array Analysis (RMA) (Irizarry et al., 2013) using the BioConductor software package (www.bioconductor.org). For the RNA-Seq transcriptomics data, we equalized the RPKM measurements performing the log2 transform, and normalized the data using the quantile normalization method.

Table I

Data sets used in the different transcriptomics analyses.

Sample stage Number of samples Mouse strain Database Database ID Data platform Type of analysis Reference 
E3.5 ICR GEO GSE7309 GPL1261 Bulk Maekawa et al. (2007) 
E3.5 CF1 GEO GSE23009 GPL1261 Bulk Giritharan et al. (2010) 
E3.5, IVF embryo Unknown GEO GSE41358 GPL1261 Bulk Cao et al. (2013) 
E4.5 B6D2F1 GEO GSE41925 GPL1261 Bulk Choi et al. (2012) 
E3.25, ICM
E3.5, ICM
E4.5, ICM 
36
22
BL/6xC3H Array Express E-MTAB-1681 GPL1261 Single cell Ohnishi et al. (2014) 
E3.5, ICM 20 C57BL/6 GEO GSE4309 GPL1261 Single cell Kurimoto et al. (2006) 
16-cell stage
Early blastocyst
Middle blastocyst
Late blastocyst 
50
43
60
30 
CAST/EiJ(mother) × C57BL/6J(father) GEO GSE45719 GPL13112 Single cell Deng et al. (2014) 
Sample stage Number of samples Mouse strain Database Database ID Data platform Type of analysis Reference 
E3.5 ICR GEO GSE7309 GPL1261 Bulk Maekawa et al. (2007) 
E3.5 CF1 GEO GSE23009 GPL1261 Bulk Giritharan et al. (2010) 
E3.5, IVF embryo Unknown GEO GSE41358 GPL1261 Bulk Cao et al. (2013) 
E4.5 B6D2F1 GEO GSE41925 GPL1261 Bulk Choi et al. (2012) 
E3.25, ICM
E3.5, ICM
E4.5, ICM 
36
22
BL/6xC3H Array Express E-MTAB-1681 GPL1261 Single cell Ohnishi et al. (2014) 
E3.5, ICM 20 C57BL/6 GEO GSE4309 GPL1261 Single cell Kurimoto et al. (2006) 
16-cell stage
Early blastocyst
Middle blastocyst
Late blastocyst 
50
43
60
30 
CAST/EiJ(mother) × C57BL/6J(father) GEO GSE45719 GPL13112 Single cell Deng et al. (2014) 

Note: GPL1261 corresponds to Affymetrix Mouse Genome 430 2.0 Array, and GPL13112 to Illumina HiSeq2000 NGS transcriptomics.

E, embryonic day; ICM, inner cell mass; GEO, Gene Expression Omnibus.

Selection of statistically significant differently expressed genes

For each selected two subgroups of samples, we calculated the mean of each probe across all the samples of the subgroup. Then we filtered out all the probes whose absolute value of difference of mean values between the two subgroups was less than a selection threshold θDEG [that corresponds to a fold change of log2(θDEG) on log2 scale]. To address the statistical significance of the differently expressed genes (DEGs), in the case of bulk versus single-cell comparison, to deal with the heteroscedasticity between bulk and single-cell samples, we used the Welch's unequal variances test applying the Welch–Satterthwaite equation to approximate the effective degrees of freedom of the pooled variance. In the intra single-cell comparison, we applied the Student's t-test. In both bulk versus single cells, and intra single-cell comparisons, the multitest effect influence was tackled through controlling the false-discovery rate using the Benjamini–Hochberg method to correct the initial P-values. We used θDEG = 4 for the bulk versus single-cell comparison, and θDEG = 8 for the intra single-cell comparisons. We used a more stringent significance threshold αDEG = 0.001 for the bulk and single-cell comparison, and αDEG = 0.01 for the intra single-cell comparisons.

Gene ontology statistically significant enrichment analysis

The gene ontology (GO) terms were taken from the curated collection of molecular signatures (gene set collection C5) of version 3.0 of the Molecular Signatures Database (MSigDB) (Subramanian et al., 2005). The significance of DEGs was analyzed using an enrichment approach based on the hypergeometric distribution. The significance (P-value) of the gene set enrichment was calculated using the hypergeometric distribution. The multitest effect influence was corrected through controlling the false-discovery rate using the Benjamini–Hochberg correction at a significance level αGO = 0.01.

Selection of ICM single cells from the RNA-Seq data

The middle and late blastocysts single-cell RNA-Seq data were taken from Deng et al. (2014). We equalized the RPKM measurements performing the log2 transform, and normalized the data using the quantile normalization method. We selected as ICM samples those with an expression level of Cdx2 <0.1 and expression level of Sox2 >5.

The hierarchical clustering of genes and samples was performed with one minus correlation metric and the unweighted average distance (UPGMA), also known as group average, linkage method. All the software and graphics were developed using in-house implemented functions in Matlab.

Results

Bulk and single cells from mouse early developmental stages have different transcriptomics profiles

The bidimensional principal component analysis (PCA) of transcriptomics data of bulk cells from E3.5 and E4.5, and single cells from E3.25, E3.5 and E4.5 (data from Ohnishi et al., 2014), in Fig. 1A, reveals that the bulk samples are clearly separated from the single-cell samples. This can be partly due to the batch effect caused by different laboratory experimental procedures. Interestingly, the bulk samples E3.5 IVF (Cao et al., 2013) and E3.5 ICR (Giritharan et al., 2010), though originating from different laboratories, cluster together tightly. The most informative principal component (PC) that explains 17% of the data variability, the first PC, separates the bulk samples in the negative first PC from the single-cell samples that have mainly positive first PC. The second PC, that explains 6.4% of the transcriptomics variability, reflects the developmental stage of the samples, where the age increases inversely to the increase in the second PC axis, both for bulk and for single cells. The difference between the bulk and single-cell transcriptomics profiles is also reflected in the hierarchical clustering (Fig. 1B), where the bulk samples cluster away from the single-cell samples. The heatmap performed with the most variable probes across all the samples shows three main modes of behavior (Fig. 1C): (i) a set of commonly high expressed probes in both bulk and single-cell samples (the full list is given in Supplementary data, Fig. S1A), (ii) a set of commonly low expressed probes in both bulk and single-cell samples (the full list is given in Supplementary data, Fig. S1B) and (iii) a set of probes which are commonly high expressed in the bulk samples, but have a highly variable expression in the single cells, and which are analyzed in the following section.

Figure 1

Bulk and single cells from mouse early developmental stages have different transcriptomics profiles. (A) Bidimensional PCA of transcriptomics data from bulk cells from embryonic day (E) 3.5 and E4.5, and single cells from E3.25, E3.5 and E4.5. The bulk cells are represented by red (for E3.5) and magenta (for E4.5) diamonds, and the single cells by green (E3.25), blue (PE) and cyan (EPI) spheres. Ellipses encircle the samples of the same embryonic day and data type, where ‘b’ and ‘sc’ stand for bulk and single-cell data, respectively. (B) Circular hierarchical clustering of the data. (C) Heatmap of the top highly variable probes across all the samples. The vertical blocks a, b and c mark the three main modes of transcription behavior.

Figure 1

Bulk and single cells from mouse early developmental stages have different transcriptomics profiles. (A) Bidimensional PCA of transcriptomics data from bulk cells from embryonic day (E) 3.5 and E4.5, and single cells from E3.25, E3.5 and E4.5. The bulk cells are represented by red (for E3.5) and magenta (for E4.5) diamonds, and the single cells by green (E3.25), blue (PE) and cyan (EPI) spheres. Ellipses encircle the samples of the same embryonic day and data type, where ‘b’ and ‘sc’ stand for bulk and single-cell data, respectively. (B) Circular hierarchical clustering of the data. (C) Heatmap of the top highly variable probes across all the samples. The vertical blocks a, b and c mark the three main modes of transcription behavior.

Fgf4 is one of the most variably expressed transcripts both in single cells and in bulk samples

The pairwise scatter plot between single cells and bulk samples at E3.5 shows (Fig. 2A), as expected from the global analysis performed in the previous section, a high number (3035) of DEGs (with fold change 2 in log2 scale) between them, and a Pearson's correlation coefficient ρ = 0.89. We checked for their statistical significance using the Welch's unequal variances t-test to deal with the heteroscedasticity between bulk and single-cell samples, and visualized the results using volcano plots. The results show that most of the DEGs are statistically significant, being more up-regulated in the E3.5 bulk cells than down-regulated in relation to the E3.5 single cells (Fig. 2C). An enrichment analysis of GOs of the DEGs up-regulated in E3.5 bulk cells showed that the most statistically significant (P-value < 0.01 using the hypergeometric test) terms are ‘RNA process’ and ‘Organelle’-related terms (Fig. 2E).

Figure 2

Main DEGs between bulk and single cells. Pairwise scatter plots of (A) E3.5 single versus E3.5 bulk. (B) E4.5 single versus E4.5 bulk samples. The black lines mark the boundaries of the 2-fold changes in the transcript expression level between the paired samples. Transcripts, which are up-regulated or down-regulated in ordinate samples compared with abscissa samples, are shown with red or green circles, respectively. The positions of some markers are shown as orange dots. The color bar to the right indicates the scattering density. Darker blue color corresponds to higher scattering density. The transcript expression levels are log2 scaled. Volcano plots of (C) E3.5 single cells versus E3.5 bulk. (D) E4.5 single cells versus E4.5 bulk. The red circles correspond to statistically significant DEGs up-regulated in the single-cell populations, and the green circles to DEGs up-regulated in the bulk samples. The positions of the same markers used in the scatter plots are shown as orange dots. Top 10 statistically significant GO terms enriched in the up-regulated DEGs: (E) E3.5 bulk versus E3.5 single cells, and (F) E4.5 bulk versus E4.5 single cells. (G) Violin plot of the distribution of the standard deviation of each probe across the different groups of samples. The red squares mark the mean values. Bar plots of the standard deviation (σ) of the expression of the top 40 highly variable transcripts across (H) bulk and (I) single-cell samples.

Figure 2

Main DEGs between bulk and single cells. Pairwise scatter plots of (A) E3.5 single versus E3.5 bulk. (B) E4.5 single versus E4.5 bulk samples. The black lines mark the boundaries of the 2-fold changes in the transcript expression level between the paired samples. Transcripts, which are up-regulated or down-regulated in ordinate samples compared with abscissa samples, are shown with red or green circles, respectively. The positions of some markers are shown as orange dots. The color bar to the right indicates the scattering density. Darker blue color corresponds to higher scattering density. The transcript expression levels are log2 scaled. Volcano plots of (C) E3.5 single cells versus E3.5 bulk. (D) E4.5 single cells versus E4.5 bulk. The red circles correspond to statistically significant DEGs up-regulated in the single-cell populations, and the green circles to DEGs up-regulated in the bulk samples. The positions of the same markers used in the scatter plots are shown as orange dots. Top 10 statistically significant GO terms enriched in the up-regulated DEGs: (E) E3.5 bulk versus E3.5 single cells, and (F) E4.5 bulk versus E4.5 single cells. (G) Violin plot of the distribution of the standard deviation of each probe across the different groups of samples. The red squares mark the mean values. Bar plots of the standard deviation (σ) of the expression of the top 40 highly variable transcripts across (H) bulk and (I) single-cell samples.

A parallel analysis performed with E4.5 data shows similar results to those obtained for E3.5 data, with slightly more DEGs (3445), and a Pearson's correlation coefficient ρ = 0.89 (Fig. 2B). There are still more statistically significant transcripts up-regulated in the E4.5 bulk samples than down-regulated in relation to the E4.5 single cells (Fig. 2D). Additionally, the GO enrichment analysis of the DEGs up-regulated in E4.5 bulk cells showed that among the most statistically significant GO terms are ‘Actin’ and again ‘Organelle’-related terms (Fig. 2F).

Some of the transcriptomics discrepancies between bulk samples and single cells could be due to a different transcriptomics variability captured by the bulk or single-cell experiments. To check such variability, we calculated for each microarray probe the variability, using the standard deviation, across all bulk replicates and all single cells at E3.5 and E4.5. We depicted the distributions of the corresponding transcriptomics variability using violin plots (Fig. 2G), which show that the transcripts of the E3.5 bulk samples have higher variability than the E4.5 bulk and all the single-cell samples. This is probably due to the longer distance between the E3.5 CF1 and E3.5 IVF samples revealed already by the PCA (Fig. 1A) and the hierarchical clustering (Fig. 1B). The E4.5 bulk samples have lower variability than any of the single cells, indicating, as is also shown in the PCA (Fig. 1A), that the single cells captured more information (that is, have more variability) than the bulk samples. Next, we checked for the genes that contribute more to the transcriptomics variability by depicting the standard deviation, σ, of the expression of the top 40 highly variable transcripts across the bulk (Fig. 2H) and single-cell samples (Fig. 2I). In the single-cell case, among the top highly variable genes are well-known EPI, Fgf4 and Spic; and PE, Sox17 and Cubn, markers (Fig. 2I). Interestingly, in the case of bulk samples (Fig. 2H), there appears also Fgf4, indicating that the signal of this EPI marker is so strong in early development that it can be silhouetted even over the noise of the background of the bulk samples.

Transcriptomics dynamics analysis of PE transitions reveals E3.5-specific PE markers related to the extracellular regulated kinase pathway

Bulk sample transcriptomics data lack enough information to study early developmental decisions. Therefore, we used the single-cell data from inner cells from E3.25 to E4.5 (Ohnishi et al., 2014), to understand the dynamics of the PE and EPI specification. We employed the change from E3.5 to E4.5 of DEGs between PE and EPI cells as a tool to observe the transition of the ICM cells to PE and EPI, and we used the common DEGs for E3.5 and E4.5 as the markers defining the steady states.

The pairwise scatter plots of the DEGs between E3.5 EPI and E3.5 PE (Fig. 3A), and between E4.5 EPI and E4.5 PE (Fig. 3B), show that the number of DEGs between EPI and PE increases with the progress of the embryonic development. There are 60 DEGs (with fold change 3 in log2 scale) between E3.5 EPI and E3.5 PE with Pearson's correlation coefficient ρ = 0.98, and 984 DEGs between E4.5 EPI and E4.5 PE with ρ = 0.91. Consequently, the number of the statistically significant DEGs increases with the time, as illustrated by the volcano plots in Fig. 3C and D, indicating how the PE and EPI lineages bifurcate during development.

Figure 3

Dynamics of PE and EPI markers revealed by single-cell transcriptomics analysis. Pairwise scatter plots of (A) E3.5 EPI versus E3.5 PE. (B) E4.5 EPI versus E4.5 PE. The black lines are the boundaries of the 2-fold changes in gene expression levels between the paired samples. Transcripts which are up-regulated in ordinate samples compared with abscissa samples, are shown with red circles; those down-regulated, are shown with green ones. The positions of some markers are shown as orange dots. The color bar to the right indicates the scattering density. Darker blue color corresponds to higher scattering density. The transcript expression levels are log2 scaled. Volcano plots of (D) E3.5 EPI versus E3.5 PE. (E) E4.5 EPI versus E4.5 PE. The red and green circles correspond to statistically significant DEGs up-regulated in the EPI and PE single cells, respectively. The positions of the same markers used in the scatter plots are shown as orange dots. (C) Euler–Venn diagram of the intersection of the E3.5 and E4.5 PE markers. (F) Euler–Venn diagram of the intersection of the E3.5 and E4.5 EPI markers. The lists of markers for each developmental stage are given in frames, red for E3.5-specific, yellow for the shared E3.5 ∩ E4.5 EPI and green for E4.5-specific markers, respectively. The full lists for E4.5 markers are given in Supplementary data, Table SI. The transcripts in the red and yellow frames that have not been previously described as related to PE or EPI decisions are highlighted in boldface. The PGC markers are written in red.

Figure 3

Dynamics of PE and EPI markers revealed by single-cell transcriptomics analysis. Pairwise scatter plots of (A) E3.5 EPI versus E3.5 PE. (B) E4.5 EPI versus E4.5 PE. The black lines are the boundaries of the 2-fold changes in gene expression levels between the paired samples. Transcripts which are up-regulated in ordinate samples compared with abscissa samples, are shown with red circles; those down-regulated, are shown with green ones. The positions of some markers are shown as orange dots. The color bar to the right indicates the scattering density. Darker blue color corresponds to higher scattering density. The transcript expression levels are log2 scaled. Volcano plots of (D) E3.5 EPI versus E3.5 PE. (E) E4.5 EPI versus E4.5 PE. The red and green circles correspond to statistically significant DEGs up-regulated in the EPI and PE single cells, respectively. The positions of the same markers used in the scatter plots are shown as orange dots. (C) Euler–Venn diagram of the intersection of the E3.5 and E4.5 PE markers. (F) Euler–Venn diagram of the intersection of the E3.5 and E4.5 EPI markers. The lists of markers for each developmental stage are given in frames, red for E3.5-specific, yellow for the shared E3.5 ∩ E4.5 EPI and green for E4.5-specific markers, respectively. The full lists for E4.5 markers are given in Supplementary data, Table SI. The transcripts in the red and yellow frames that have not been previously described as related to PE or EPI decisions are highlighted in boldface. The PGC markers are written in red.

The DEGs that are specific for E3.5 but are no longer DEGs at E4.5 could be considered as early transitory precursors of the PE and EPI specification that lose their role at E4.5. We named them E3.5-specific markers. We used two complementary types of graphs to depict the transitions of such markers. On one side, we made heatmaps to show the expression values across the single-cell samples of Ohnishi et al. (2014) from E3.5 to E4.5 (Fig. 4A and B). On the other side, to follow the dispersion dynamics of the expression values, we drew violin plots of the transcriptomics distributions across the different developmental stages (Fig. 5, Supplementary data, Figs S2 and S3).

Figure 4

PE and EPI markers specific for E3.5 single cells. Heatmaps of the E3.5-specific (A) PE and (B) EPI transcripts. The color bar in the top codifies the gene expression in log2 scale. Higher gene expression corresponds to a deeper red color. Correlation of the transcriptomics profiles from (C) E3.25 (Ohnishi et al., 2014), (D) E3.5 and E4.5 (Ohnishi et al., 2014), (E) E3.5 ICM (Kurimoto et al., 2006) and E4.5 (Ohnishi et al., 2014), (F) ICM from middle and late blastocyst (Deng et al., 2014), with the expression of the transcription factor AP-2, gamma, Tcfap2c (probe 1436392_at). Center panels of (C)–(F) represent the correlation of the expression of all genes with the expression of Tcfap2c, marking the position of the correlation of the DNA (cytosine-5-)-methyltransferase 3-like, Dnmt3l, and the PR domain containing 14, Prdm14. Left and right panels represent the top 50 high positively and negatively correlated genes, respectively. Green and red colors represent positively and negatively correlated genes, respectively.

Figure 4

PE and EPI markers specific for E3.5 single cells. Heatmaps of the E3.5-specific (A) PE and (B) EPI transcripts. The color bar in the top codifies the gene expression in log2 scale. Higher gene expression corresponds to a deeper red color. Correlation of the transcriptomics profiles from (C) E3.25 (Ohnishi et al., 2014), (D) E3.5 and E4.5 (Ohnishi et al., 2014), (E) E3.5 ICM (Kurimoto et al., 2006) and E4.5 (Ohnishi et al., 2014), (F) ICM from middle and late blastocyst (Deng et al., 2014), with the expression of the transcription factor AP-2, gamma, Tcfap2c (probe 1436392_at). Center panels of (C)–(F) represent the correlation of the expression of all genes with the expression of Tcfap2c, marking the position of the correlation of the DNA (cytosine-5-)-methyltransferase 3-like, Dnmt3l, and the PR domain containing 14, Prdm14. Left and right panels represent the top 50 high positively and negatively correlated genes, respectively. Green and red colors represent positively and negatively correlated genes, respectively.

Figure 5

Violin plots of the transcriptomics distributions of bulk and single-cell microarrays across different developmental stages of a selection of E3.5-specific markers. (A) E3.5-specific PE markers. (B) E3.5-specific EPI markers. The color codifies the distribution density. Red and blue color corresponds to high and low density, respectively. The mean and median values are shown as red crosses and green squares, respectively. Each individual expression value is represented by a black point. All the violin plots of the E3.5-specific PE and EPI markers are given in Supplementary data, Figs S2 and S3, respectively.

Figure 5

Violin plots of the transcriptomics distributions of bulk and single-cell microarrays across different developmental stages of a selection of E3.5-specific markers. (A) E3.5-specific PE markers. (B) E3.5-specific EPI markers. The color codifies the distribution density. Red and blue color corresponds to high and low density, respectively. The mean and median values are shown as red crosses and green squares, respectively. Each individual expression value is represented by a black point. All the violin plots of the E3.5-specific PE and EPI markers are given in Supplementary data, Figs S2 and S3, respectively.

Nineteen E3.5-specific PE transcripts are shown in the Euler–Venn diagram in Fig. 3E. Fgfr2 is already known as being involved in early PE specifications (Morris et al., 2013). It has a salt-and-pepper pattern of expression at E3.25, splitting the cells at E3.5 between PE and EPI, and attenuating such pattern at E4.5 (Figs 4A and 5A). The remaining transcripts not previously described as being involved in early PE specification are: the stimulatory of de novo DNA methylation co-factor Dnmt3l (Gowher et al., 2005), which because of its important interaction with other transcripts disclosed by the dynamics analysis, will be discussed in the next section; the dual specificity phosphatase 4, Dusp4, inactivator of extracellular regulated kinase (ERK)1/2 and the c-Jun N-terminal Kinase (JNK) (King et al., 1995) pathways, which after being expressed at low level in the early developmental stages, stabilizes at medium expression at E4.5 (Fig. 5A); the calcium-dependent membrane-binding protein COPINE 8, Cpne8 (Maitra et al., 2003), whose transcriptional dysregulation induced by LRRK2 is ERK-dependent (Reinhardt et al., 2013), and whose expression bifurcates very early and stabilizes, vanishing at E4.5 (Fig. 5A). Cpne8 behaves exactly like Dnmt3l (Fig. 4A); the A kinase (PRKA) anchor protein 13, Akap13, which enhances the cyclic AMP control of the ERK1/2 cascade (Smith et al., 2010). Akap13 is expressed at low level at E3.25, splits to high expression in PEs and low in EPIs at E3.5, and is highly expressed at E4.5 (Figs 4A and 5A); the 1110018J18Rik (Figs 4A and 5A), which denotes the AhpC/TSA antioxidant enzyme domain containing 1, Aaed1.

The DDB1 and CUL4-associated factor 12 like 1, Dcaf12l1, also known as WD repeat domain 40B:Wdr40B, is a germ line-related gene since it has been found to be expressed in the ovary with a specific functional requirement in meiotic entry, follicle assembly (Supplementary data, Table SI; Gallardo et al., 2007). Wdr40B is among the ERK-dependent positively correlated genes in mouse preimplantation embryos (Group 2, Supplementary data, Table SI; Maekawa et al., 2007). Dcaf12l1 is E3.5-specific, but it is at the boundary of being classified as a steady PE marker, since with the exception of one E4.5 PE single cell, all other E4.5 PE cells show high expression (Fig. 4A, Supplementary data, Fig. S2).

Among the E3.5-specific PE markers are two probes (Fig. 5A, Supplementary data, Fig. S3) of the UDP-Gal:betaGlcNAc beta 1,4-galactosyltransferase, polypeptide 6, B4galt6, a transmembrane protein and one of the seven beta-1,4-galactosyltransferase (beta4GalT) genes. Out of them, B4galt1 and B4galt5 have high expression across E3.5 and E4.5 for both PE and EPI cells, while the remaining four genes are expressed neither in PE nor in EPI single cells across this period.

The cDNA sequence BC100530 is highly expressed at E3.25, stays so in PEs but starts to reduce its expression in a salt-and-pepper pattern in EPIs at E3.5, and becomes low expressed in both PEs and EPIs at E4.5 (Figs 4A and 5A). The member of the minor U12-type spliceosome, Rnpc3, has a salt-and-pepper expression pattern at E3.25, medium expression in PEs and low in EPIs at E3.5, and again a salt-and-pepper expression pattern at E4.5 (Figs 4A and 5A).

The tissue factor pathway inhibitor, Tfpi, is known to interfere with endothelial cell migration by inhibition of both the ERK pathway and focal adhesion proteins (Provençal et al., 2008). Tfpi has three probes with very strong stabilization at E4.5 (Fig. 4A, Supplementary data, Fig. S2). The probes (1451790_a_at, 1451791_at) that target all 10 exons stabilize at high expression at E4.5. Their distinctive signals emerge also in the E4.5 bulk samples. Probe 1438530_at stabilizes at low expression at E4.5. The 1110067D22Rik gene corresponding to the lectin galactoside binding-like Lgalsl, has a salt-and-pepper expression pattern in E4.5 PE and EPIs (Figs 4A and 5A). The cytoskeleton-associated protein 4, Ckap4/P63, is expressed 2-fold more in E4.5 bulk samples than in E3.5 ones. For the E3.5 single cells, Ckap4 is PE-specific and preserves similar expression at E4.5, while in EPIs, its expression catches up with the PEs at E4.5, thus explaining the bulk expression behavior (Fig. 4A, Supplementary data, Fig. S2). Conversely, the expression of the F-box and leucine-rich repeat protein 20, Fbxl20, protein-ubiquitin ligase (Jin et al., 2004) is higher for E3.5 and lower for E4.5 bulk samples (Supplementary data, Fig. S2), while for the single cells, Fbxl20 is an E3.5-specific PE marker whose expression decreases considerably at E4.5 (Fig. 4A).

Transcriptomics dynamics analysis of EPI transitions reveals new E3.5-specific EPI markers

The number of the detected E3.5-specific EPI transcripts is similar to that of the E3.5-specific PE transcripts (Fig. 3F). Among the genes already known to be involved in early blastocyst development is the tripartite motif-containing 43A, Trim43a, described to be expressed specifically in mouse preimplantation embryos, and to peak at the 8-cell to morula stage; however, the three Trim43 genes (Trim43a, Trim43b and Trim43c) are not expressed in mouse embryonic stem cells (ESCs) (Stanghellini et al., 2009). Probes for Trim43b and Trim43c are not present in the used arrays. Trim43a follows the same pattern as Tcfap2c: highly expressed at E3.25, making a transition at E3.5 by splitting its expression between EPI and PEs, and stabilizing at E4.5 at low expression (Figs 4B and 5B).

The solute carrier family 2 (facilitated glucose transporter), member 3, Slc2a3, also known as Glut3 (placental glucose transporter), is a nutrient sensor during preimplantation development (Pantaleon et al., 2008). Selected with two probes as E3.5-specific, it has a salt-and-pepper expression pattern in PEs and high expression in EPIs at E3.5, and becomes highly expressed at E4.5 in both (Fig. 4B). Such expression increase at E4.5 is observed even in the bulk samples (Supplementary data, Fig. S3).

The PR domain containing 14, Prdm14, is a transcriptional regulator, specifically expressed in mouse preimplantation embryos, PGCs and pluripotent stem cells. Prdm14 acts through repressing its negative target genes by recruiting polycomb repressive complex 2 (PRC2) (Yamaji et al., 2013) and is associated with a unique expression profile in human PGC commitment in vitro (Sugawa et al., 2015). Its expression fluctuates at E3.25 and stabilizes at a low level in both PE and EPIs at E4.5 through splitting between EPIs (high expression) and PEs (low expression) at E3.5 (Figs 4B and 5B).

The Spi-C transcription factor (Spi-1/PU.1 related), Spic, was found to be specific to the 4-cell embryo stage (Hamatani et al., 2004). Here, we found that it has a salt-and-pepper expression pattern in PEs and high expression in EPIs at E3.5, and becomes low expressed in both PE and EPIs at E4.5 (Fig. 4B). Such expression decrease is observed even in the bulk samples (Supplementary data, Fig. S3).

Among the genes not previously described as involved in early EPI specifications, are: the serum/glucocorticoid-regulated kinase 1, Sgk1, that inhibits the ERK1/2 activity (Liu et al., 2014), and which at E3.25 is highly expressed, then discriminates between PEs and EPIs at E3.5, and stabilizes to high level of expression at E4.5 (Figs 4B and 5B); the integral membrane myelin and lymphocyte, Mal, that has low expression at all analyzed developmental stages except at E3.5, when it is highly expressed in the EPI cells (Figs 4B and 5B); the UBX domain protein 2A, Ubxn2a, that has a salt-and-pepper expression pattern across all early developmental stages except at E3.5, when with some exceptions, it tends to be high expressed in EPIs, and low expressed in PEs (Figs 4B and 5B); the autophagy-related 16-like 2, Atg16l2, which discriminates between EPI and PE cells at E3.5, and stabilizes at low expression level at E4.5, irrespectively of the type, PE or EPI (Figs 4B and 5B); the predicted gene Gm13102, which is high expressed in E3.25 cells, making a transition at E3.5 splitting its expression between EPI and PEs, and stabilizing at E4.5 at low expression (Figs 4B and 5B); the solute carrier family 2 (facilitated glucose transporter), member 3, Slc2a3; the hexosaminidase B, Hexb (Fig. 4B and Supplementary data, Fig. S3); the solute carrier family 1 (neuronal/epithelial high affinity glutamate transporter, system Xag), member 1, Slc1a1, which has a salt-and-pepper expression pattern in PE cells and high expression in EPIs at E3.5, and becomes low expressed at E4.5 in both PE and EPIs (Fig. 4B). Such an expression decrease is observed even in the bulk samples (Supplementary data, Fig. S3); the small VCP/p97-interacting protein, Svip, which is low expressed in PEs and with a salt-and-pepper expression pattern in EPIs at E3.5, and becomes low expressed in both PE and EPIs at E4.5 (Fig. 4B). Such expression decrease is observed even in the bulk samples (Supplementary data, Fig. S3); the lipase, member H, Liph, that encodes a membrane-bound member of the mammalian triglyceride lipase family, has a low expression at E3.25, which continues being low in PE cells, while becoming medium expressed in EPIs at E3.5, and vanishes at E4.5 (Fig. 4B, Supplementary data, Fig. S3); the mesoderm induction early response 1, family member 3, Mier3, which has a salt-and-pepper expression pattern during E3.25, that continues being such in PEs but becomes high expressed in EPIs at E3.5, and becomes low expressed in both PE and EPIs at E4.5. Such expression decrease is observed even in bulk cells (Fig. 4B, Supplementary data, Fig. S3); and Tcfap2c/Tfap2c, which, because of its important interaction with other transcripts disclosed by the dynamics analysis, will be discussed in the next section.

Transcriptomics dynamics analysis suggests a crosstalk between the E3.5-specific PE and EPI markers Dnmt3l and Tcfap2c for the PE–EPI cell fate specification

On one side, we found Tcfap2c to be among the 16 E3.5-specific EPI markers (Fig. 3F) revealed by the dynamics of the PE and EPI transitions. The violin plots of the transcriptomics distributions across the different developmental stages (Fig. 5B) show Tcfap2c as high expressed in E3.25 cells, making a transition at E3.5 by splitting its expression between EPI and PE cells, and stabilizing at E4.5 at very low expression (under the detection threshold). Tcfap2c has a transcriptional behavior (Fig. 4B) exactly opposite to the one shown by its target Dnmt3l (Fig. 4A), suggesting that TCFAP2C could exert its regulation over Dnmt3l by repressing Dnmt3l transcription. The bulk data transcriptomics mask such dynamics that have been revealed by the single-cell transcriptomics.

On another side, Dnmt3l is among the 17 E3.5-specific PE transcripts (Fig. 3E). Its expression across the different developmental stages fluctuates at E3.25, bifurcates to split the cells between EPI and PE at E3.5 (Fig. 5A), and stabilizes to a high level, in contrast to its binder gene Tcfap2c, at E4.5. Additionally, Fig. 4A shows a very stable high expression of Dnmt3l for all the E4.5 single cells, independent of them corresponding to the PE or EPI cell type. This behavior could mean that EPIs at E3.5 require transitionally less de novo DNA methylation than PEs, but that both EPIs and PEs need such methylation in the following developmental stage. To check this, we performed violin plots of the transcriptomics distributions across the different developmental stages for all the DNA methyltransferases across several developmental stages (Supplementary data, Fig. S4). DNMT3A is reported to be required for gametic methylation but is dispensable for the novo methylation in the early embryo, and conversely, DNMT3B is required for embryonic but not for gametic methylation (Okano et al., 1999; Kaneda et al., 2004, 2010; Borgel et al., 2010; Guenatri et al., 2013). The expression distribution of the DNA methyltransferases shows that two probes of Dnmt3a, 1423065_at and 1460324_at, have very low expression level in the earlier developmental stages (Supplementary data, Fig. S4). This is an observation consistent with the gamete-specific DNMT3A dispensability in the early embryo development. However, the Dnmt3a expression splits at E4.5 for low expression in EPI, and high expression in PE cells, indicating a previously not described potential role in the PE–EPI specification at E4.5. In concordance with that, Dnmt3a has passed the computational filters used to define the set of 505 E4.5-specific PE markers for fold change 2. On the other side, Dnmt3b, the de novo methylation embryo-specific methyltransferase (probe 1449052_a_at), has a high expression level across all the developmental stages, irrespective of whether the single cells are PE or EPI.

To quantify the dynamics of the interrelation between Tcfap2c and Dnmt3l, we calculated the correlation coefficients of the expression of all the genes in relation to Tcfap2c for developmental stages E3.25, E3.5 and E4.5 from Ohnishi et al. (2014), and we evaluated the rank of the correlation of Dnmt3l with Tcfap2c, in comparison with all other genes. At E3.25, the expression of Dnmt3l is not correlated with that of Tcfap2c (Fig. 4C). From E3.5 to E4.5, Dnmt3l is the top third negatively correlated gene with Tcfap2c (Fig. 4D). The top negatively correlated with Tcfap2c genes, Akr1b8/Fgfrp and Abcg2, did not fulfill the computational conditions be considered as PE markers. To confirm the strong anticorrelation of Tcfap2c and Dnmt3l from E3.5 to E4.5, we used two additional data sets. On one side, we used the E3.5 ICM single-cell data of Kurimoto et al. (2006) and, due to the lack of E4.5 ICM data there, we combined them with the E4.5 data of Ohnishi et al. (2014). We found that again Dnmt3l is strongly anticorrelated with Tcfap2c (Fig. 4E). On the other side, we used the RNA-Seq single-cell data from middle and late blastocysts of Deng et al. (2014). The ICM single cells were selected with an algorithm described in the Materials and Methods section. In this case, Dnmt3l is even more anticorrelated with Tcfap2c, being the 47th most anticorrelated transcript (Fig. 4F).

Steady PE markers revealed by single-cell transcriptomics analysis

In the previous sections, we analyzed the PE and EPI transitions to find clues about the main players and precursor events of the PE and EPI specification. However, for the later embryonic periods, it is more suitable to use lasting and robust PE and EPI markers. To obtain them, we used the common DEGs between single-cell PE and EPIs, shared by the E3.5 and E4.5 stages. Their lists are given inside the yellow frames in Fig. 3E and F, the heatmaps of their expression in all single cells in Supplementary data, Fig. S5, and the distributions of the expression of some of them in bulk populations and single cells in the violin plots in Supplementary data, Fig. S6. As expected, we found many transcripts that have already been described to be PE or EPI markers.

Among the already known PE markers, we found: the procollagen-proline, 2-oxoglutarate 4-dioxygenase (proline 4-hydroxylase), alpha II polypeptide, P4ha2; the cubilin (intrinsic factor-cobalamin receptor), Cubn; the SRY (sex-determining region Y)-box 17, Sox17; the serine (or cysteine) peptidase inhibitor, clade H, member 1, Serpinh1 also known as Hsp47;the forkhead box Q1, Foxq1; laminin B1, Lamb1; and thealdehyde dehydrogenase 18 family, member A1, Aldh18a1; all of them have already been validated by single-cell quantitative PCR (qPCR) measurements to be expressed in EPI at E3.5 and E4.5 (Ohnishi et al., 2014); the GATA binding protein 4, Gata4; the collagen, type IV, alpha 1, Col4a1, that together with perlecan (HSPG2) has been described to exhibit dynamic localization in the murine allantoic core domain (Mikedis and Downs, 2009); and Pdgfra, the platelet-derived growth factor receptor, alpha polypeptide, identified by Artus et al. (2010) as an early PE marker.

Among the genes that have not been previously described to be involved in early PE specification, we found the laminin, alpha 1, Lama1, the dickkopf homolog 1, Dkk1, and the hepatitis A virus cellular receptor 1, Havcr1. Lama1−/− embryos fail to develop Reichert's membrane and die by E7 (Miner et al., 2004). The function of another stable PE marker, Sox17, in the differentiation of ICM and ES cells is to bind the regulatory regions of many fellow steady PE marker genes, such as Col4a1, Lama1, Lamb1, that encode basement membrane components, and activate these genes (Niakan et al., 2010). DKK1 blocks the canonical Wingless-related integration site (Wnt) pathway and various embryonic patterning processes require a distinct DKK1-mediated Wnt inhibition (Lieven et al., 2010). Havcr1 belongs to type I cell-surface glycoproteins and has an important role in infectious diseases such as Hepatitis, Ebola and Dengue, but has never been described as having a role in early development.

We have chosen eight of the known steady markers (four for each population of PEs and EPIs) to delineate the dynamics of the PE and EPI specification from inner single cells, starting from E3.25 until E4.5 (Fig. 6A). At E3.25, the inner single-cell PE and EPI markers show a salt-and-pepper expression pattern, suggesting that at this stage, the PE–EPI fate decision has not been settled yet. At E3.5, the PE–EPI decision has already been well established, and the chosen markers segregate perfectly between PE and EPI cells, with the PE markers, corresponding to the upper semicircles in the pie-chart representation in Fig. 6A, being high expressed (red) in PEs and low expressed (green) in EPIs. The EPI markers corresponding to the lower semicircles show a reverse behavior. Such behavior is robust and reinforced in E4.5 cells, illustrating that the selected markers are stable.

Figure 6

Expression dynamics of PE, EPI and PGC markers. (A) Known and (B) new steady PE and EPI markers [based on data from Ohnishi et al. (2014)]. PGC markers, Tcfap2c and Dnmt3l expressed in the data from (C) Ohnishi et al. (2014), (D) Kurimoto et al. (2006) and (E) Deng et al. (2014). Each sketch represents the single cells at a certain number-of-cell stage and cellular type. The gene expression of each cell is represented by a small pie-chart with selected markers. The color bar to the left codifies the gene expression on a log2 scale. Higher gene expression corresponds to a deeper red color. The selected markers and their corresponding positions in the pie-chart are represented by the black and white pie-charts in the bottom. In the case of ICM data sets, the semicircle of gray circles surrounding the top part of each group of single cells represents the blastocyst outer cells. In the case of single cells from all cell types of the embryo, the cells are grouped according to their belonging to a respective embryo.

Figure 6

Expression dynamics of PE, EPI and PGC markers. (A) Known and (B) new steady PE and EPI markers [based on data from Ohnishi et al. (2014)]. PGC markers, Tcfap2c and Dnmt3l expressed in the data from (C) Ohnishi et al. (2014), (D) Kurimoto et al. (2006) and (E) Deng et al. (2014). Each sketch represents the single cells at a certain number-of-cell stage and cellular type. The gene expression of each cell is represented by a small pie-chart with selected markers. The color bar to the left codifies the gene expression on a log2 scale. Higher gene expression corresponds to a deeper red color. The selected markers and their corresponding positions in the pie-chart are represented by the black and white pie-charts in the bottom. In the case of ICM data sets, the semicircle of gray circles surrounding the top part of each group of single cells represents the blastocyst outer cells. In the case of single cells from all cell types of the embryo, the cells are grouped according to their belonging to a respective embryo.

The new steady EPI markers revealed by single-cell transcriptomics analysis are late stage germ cell markers

Among the EPI markers, we found Cldn4, the member of the claudin family, whose members are integral membrane proteins and components of tight junction strands. Cldn4 plays a critical role in maintaining cell polarity and signal transductions, and has been found together with Cldn6 to be essential for blastocyst formation in mouse preimplantation embryos (Moriwaki et al., 2007). Surprisingly, we found that another member of the 24 claudin family, Cldn7, is specifically expressed at E4.5 in PE single cells (Supplementary data, Table SI). Interestingly, it has been found that the E3.5-specific EPI Tcfap2c is a core regulator of tight junction biogenesis and cavity formation during mouse early embryogenesis, and that it is largely recruited to the Cldn4 promoter (Choi et al., 2012).

Other transcripts, already described as EPI markers, are: two transcripts of the fibroblast growth factor 4, Fgf4; the NANOG homeobox, Nanog, already validated by single-cell qPCR measurements to be expressed in EPI at E3.5 and E4.5 (Ohnishi et al., 2014); and the Kruppel-like factor 2, Klf2, which has been found to be phosphorylated by ERK2 to subsequently be degraded. MEK inhibition prevents KLF2 phosphodegradation to sustain pluripotency; thus, the MEK/ERK/KLF2 axis co-operates with the GSK3/TCF3/ESRRB pathway in mediating ground state pluripotency (Yeo et al., 2014).

Among the genes that have not been previously described to be involved in early EPI specification, are the microrchidia 1, Morc1, the msf.2 transforming sequence, Mcf2/Dbl, and the zona pellucida glycoprotein 3, Zp3. Interestingly, Zp3, here a steady EPI marker, was reported as undetectable (mRNA <1000 copies per embryo) at the morula and blastocyst stage (Roller et al., 1989). The zona pellucida is a unique, oocyte-specific matrix that coats the surface of all mammalian eggs. Composed of three sulfated glycoproteins in the mouse (ZP1, ZP2 and ZP3), the zona pellucida facilitates early events in fertilization and protects the embryo during preimplantation development (Lunsford et al., 1990). Additionally, ZP3 serves as a sperm receptor at fertilization. It is a late stage germ cell marker and mice homozygous for an insertional mutation in the Zp3 gene lack a zona pellucida and are infertile (Rankin et al., 1996).

Morc1 encodes a testis-specific protein. Male Morc1knockout mice are infertile but otherwise normal (Watson et al., 1998; Inoue et al., 1999). These studies further support a role for this protein early in spermatogenesis. MORC1 causes transposon repression in the male germline in a pattern similar to that of germ cells deficient for DNMT3L. Morc1 mutants show highly localized defects in the establishment of DNA methylation. MORC1 is an important regulator of the epigenetic landscape of male germ cells during global de novo methylation (Pastor et al., 2014). Mcf2 is specifically expressed in the nervous system and in gonads of both sexes, in maturing germ cells (Ron et al., 1988; Galland et al., 1991).

The new steady PE and EPI markers have rich transcriptomics dynamics (Fig. 6B). Contrary to the known steady markers, they have low expression at E3.25 and segregate the cells between PE and EPI at E3.5 and E4.5. Since the populations of PE and EPI cells progressively depart from each other at E4.5, we disclosed an indeed long list of 359 PE and 278 EPI E4.5-specific transcripts, with a short selection given in Fig. 3E and F, and a full list provided in Supplementary data, Table SI.

Significant PGC specification genes are highly expressed from E3.25

We found in the correlation analysis with the PGC specification transcription factor Tcfap2c, that another PGC specification transcription factor Prdm14 (Yamaji et al., 2008) is the top highly correlated with Tcfap2c gene from E3.5 to E4.5 (Fig. 4D). This is confirmed also by the E3.5 data of Kurimoto et al. (2006), where Prdm14 is also highly correlated with Tcfap2c (Fig. 4E). At the early E3.25 stage, Prdm14 shows anticorrelation with Tcfap2c (Fig. 4C). In the case of the RNA-Seq data of Deng et al. (2014), Prdm14 has insufficient counts across the different samples to calculate a reliable correlation.

Among the 278 E4.5-specific EPIs, we identified the early PGC markers Ifitm3/Fragilis (Saitou et al., 2002), Ifitm2/Fragilis3 (Lange et al., 2003), and the PGC specification cytokine Bmp4 (Lawson et al., 1999). Another important early PGC marker described by Saitou et al. (2002), Dppa3/Stella, is also high expressed in all the analyzed single cells from E3.25 to E4.5 with a slight down-regulation in PEs at E4.5 (Fig. 6C). Thus, although events toward PGC commitment during the preimplantation period are in principle not expected, we checked for the expression of other PGC markers. We used a panel of 11 mouse-specific PGCs that we have previously identified to facilitate the in vitro derivation of germ cells and gametes (Sabour et al., 2011) from pluripotent populations. Interestingly, we found that 4 out of the 11 PGC markers show preimplantation expression in the data from Ohnishi et al. (2014) (Supplementary data, Fig. S7). Fkbp6, FK506 binding protein, is significantly expressed across all the analyzed developmental stages. Gm1673 and Hba-a1, hemoglobin alpha, adult chain 1, are moderately expressed across all the analyzed developmental stages, and Hba-a1 has a burst of expression at the 34-cell stage at E3.25 (Supplementary data, Fig. S7). 4930432K21Rik is expressed strictly in EPI single cells at E4.5.

Even though the bulk samples have enough signal to detect Fgf4, PGC markers are not detectable in bulk samples (Fig. 2H). In contrast, the single-cell analysis detected a clear burst of expression of PGC markers ever since the embryo developed to 32 cells (E3.25), which continues until E4.5, when some of them (Ifitm3, Iftm2, Bmp4) become EPI specific (Fig. 6C and Supplementary data, Table SI). We checked whether the early activation of PGC markers can be observed in other single-cell transcriptomics data sets. The analysis of the E3.5 ICM transcriptomics microarray data of Kurimoto et al. (2006) strongly confirms our finding of early activation of PGC markers (Fig. 6D). Dppa3/Stella is highly expressed in each E3.5 ICM single cell, and the other PGC markers are high expressed in a high percentage of the cells, with two single cells expressing very highly almost all the selected PGC markers (Fig. 6D). The analysis of NGS transcriptomics data of whole embryos from 16-cells stage to late blastocysts from Deng et al. (2014) also shows that Dppa3/Stella is highly expressed in each single cell (Fig. 6E), and Ifitm2/Fragilis3 is also highly expressed in a high percentage of cells. In this data set, other PGC markers, such as Prdm14, Ifitm3 and Bmp4, have lower expression than in the other two data sets. A reason for such discrepancies could be that the Deng et al. (2014) data are NGS from all cell types of the embryo, whereas Ohnishi et al. (2014) and Kurimoto et al. (2006) data are microarray data from ICM only.

Discussion

As expected, the single-cell transcriptomics data differ in behavior from the bulk data, and provide much more information. In general, the single-cell transcriptomics reveal richer dynamical silhouetted patterns that remain masked behind the noise background in the bulk data. Higher heterogeneity of the bulk samples would be expected to produce higher noise background. However, we found in several cases that the signal of some probes in the bulk data is so strong that it stands out from the background noise. The most salient representative of this phenomenon is Fgf4, which was detected on the top of the most highly variable transcripts not only in single cells (Fig. 2I) but also among the highly variable transcripts of the bulk data (Fig. 2H), as well as among the commonly low expressed transcripts between bulk and single cells (Supplementary data, Fig. S1B). This phenomenon is also reflected to a different extent in the E3.5-specific PE transcripts Tfpi, Ckap4 and Fbxl20, as well as in the E3.5-specific EPI transcripts Slc2a3, Spic, Slc1a1, Svip and Mier3. In applications of bulk transcriptomics before the ‘single cell’ era, it was commonly assumed that the samples are homogeneous, or, in some experimental designs, it was assumed operationally or ‘expectedly’ that important probes have signals strong enough to be detected in transcriptomics experiments performed over heterogeneous bulk populations. The results obtained here illustrate that indeed such an assumption could produce some operational results. However, it is highly recommended to quantify the relation between the percentage of cells of interest in a heterogeneous population, and the minimal level of expression of the probes, necessary to provide minimal thresholds that delimit the minimal conditions under which the probe of interest could be expected to stand out over the background noise of the bulk data. To obtain such calibration in a controlled way, it could be very useful to have data of exactly the same biological cultures from the same laboratory, performed with the same protocol for bulk samples and for single cells.

With the exception of very strong signals that can stand out over the background noise, the mere existence of heterogeneities in a system justifies the use of single-cell analysis. Furthermore, the cellular heterogeneity is usually not a stationary property, more so in the case of in vivo systems. Thus, it is important to develop techniques that investigate the dynamics of the heterogeneity of cellular systems. In this sense, the approach of studying the dynamics of the precursor signals at the start of the PE–EPI specification by searching for E3.5-specific PE and EPI single-cell transcripts reveals not only important individual genes, but also a rich collection of different sets of transcriptomics trajectories followed by precursor transcripts. Some trajectories, such as the ones followed by the E3.5-specific PE single-cell transcript Dnmt3l, the two probes of Tfpi, Ckap4, and to a lesser extent B4galt6, show a trend to be highly expressed in the early developmental stages, with a pulse transitory down-regulation in EPI single cells at E3.5 that could be interpreted by the cells as a signal communicating that the early EPI program is not necessary any more. The converse trajectories are found for E3.5-specific EPI transcripts, as in the case of Sgk1. We found also cases of a salt-and-pepper expression pattern (Gm13102, Slc2a3, Trim43a) in one (EPI) or even the two (E3.5 and E4.5) developmental stages (1110067D22Rik, Akap13, Dusp4, Ubxn2a), meaning that specifically for E3.5, tight regulations of such transcripts are necessary to perform the signaling of the PE or EPI precursor events, but that such regulation is not necessary before and after E3.5.

The aforementioned masking effect of the bulk data on the detection of the weak signals has a correlate in the masking of the time determination of the events that provoke the signals. The cells, like any other physical system, produce continuous signal profiles. Thus, single cells emit weak signals marking the start of events, and those signals are continuously amplified as the event progresses. When bulk samples are used to study dynamic events, the start of an event is not detected until its associated signal does not acquire enough amplitude to make it distinguishable from the background. Conversely, when, in the same context, single-cell measurements are used, an event can be detected earlier, since even weak signals associated with the event are not masked by a background. Therefore, we foresee that the use of single-cell analysis in developmental biology will make a high impact on the redefinition of the developmental events, enabling us to date them earlier, thus, changing some of the timing of events previously determined by bulk-data analysis. As we discuss in the following paragraphs, such a ‘backward-dating’ effect is already observed in the analyzed data.

The crosstalk between Dnmt3l and Tcfap2c for the PE–EPI cell fate specification at E3.5, revealed by the transcriptomics dynamics analysis, provides insights into the orchestration of a cascade of events that link genetic and epigenetic regulation. DNA methylation occurs genome-wide during gametogenesis and early embryogenesis (Borgel et al., 2010), followed by a genome-wide erasure during the expression of pluripotency factors (Smith et al., 2012). However, Tcfap2c down-regulates Dnmt3l only in EPI cells at E3.5. After the vanishing of the Tcfap2c expression at E4.5, Dnmt3l becomes very highly expressed both in PE and EPI cells, whereas the de novo methylation embryo-specific Dnmt3b depicts high expression levels across all the analyzed developmental stages. The de novo methylation gamete-specific Dnmt3a has low level of expression in the earlier developmental stages, but at E4.5 splits its expression into low and high level in EPI and PE cells, respectively.

The knockout of zygotic Tcfap2c has a post-implantation lethality around E7.5 (Auman et al., 2002; Winger et al., 2006). Choi et al. (2012) depleted both maternally and zygotically derived TCFAP2C and found that Tcfap2c knockdown embryos underwent compaction and developed normally to the morula stage but only a small percentage of Tcfap2c knockdown morulae formed blastocysts, and even those that reached this stage were not able to fully expand and hatch after extended culture. As with other genes, Tcfap2c and Dnmt3l could have multiple functional roles at different developmental stages. The fact that the loss of Tcfap2c has a post-implantation lethality around E7.5 (Auman et al., 2002; Winger et al., 2006) does not rule out that some important functional roles, such as the tight junction assembly, fluid accumulation and cellular proliferation reported by Choi et al. (2012), could have already been activated before implantation and the disruption caused by the knockout produces subsequent lethality, at E7.5. Our results suggest that TCFAP2C could have an additional function of DNA methylation before implantation through regulation of Dnmt3l expression.

The function of DNMT3L is not related to the survival of the carriers but with the survival of their progeny since embryos can attain normal genomic methylation in the absence of any source of DNMT3L, either maternal or zygotic in origin (Guenatri et al., 2013), and DNMT3L is not required for the establishment of maternal methylation imprints in mouse (Bourc'his et al., 2001; Hata et al., 2002). However, offspring from Dnmt3l−/− female mice die between E9.5 (Bourc'his et al., 2001) and E10.5 (Hata et al., 2002) showing a lack of methylation, specifically at maternally methylated imprinting centers, and Dnmt3l−/− males are completely infertile (Bourc'his et al., 2001; Hata et al., 2002). These observations point out that DNMT3L could have a role in the DNA methylation establishment of the cells responsible for transmitting the genomic information to the next generation, the PGCs. Thus, the crosstalk between Tcfap2c and Dnmt3l in the preimplantation embryo could be related not to the survival of the embryo but with the possible PGC activation during preimplantation stages. Initially, early signals for PGC activation are not to be expected since the earlier detection of PGCs is at E6.0–6.5 from a founder population proximal epiblast adjacent to the extra-embryonic ectoderm (Lawson and Hage, 1994) becoming first identifiable as a cluster of ∼40 cells at the base of the incipient allantois around E7.25 (Chiquoine, 1956; Ginsburg et al., 1990; Saitou and Yamaji, 2012). However, surprisingly, we have found high expression of early PGC markers from as early as E3.25–E4.5. Among the highly expressed PGC markers in EPI single cells are Ifitm3/Fragilis, 4930432K21Rik, Dppa3/Stella, Fkbp6 and Prdm14, while Gm1673 and Hba-a1 are medium expressed. Interestingly, Fragilis and Stella are the genes used by Saitou et al. (2002) to define the founder PGC population at E7.25. We found that all the new steady EPI markers (Zp3, Mcf2, Hexb) are late stage germ cell markers.

Concurrently, the PGC marker Prdm14 is co-expressed with Tcfap2c, and promotes DNA demethylation in two ways: (i) by directly repressing the de novo DNA methyltransferases by recruiting polycomb repressive complex 2 (PRC2) to their promoters (Yamaji et al., 2013), and (ii) by accelerating the 10–11 translocation (TET)-mediated base excision repair (BER) pathway through the direct interaction of PRDM14with TET1 and TET2 (Okashita et al., 2014). We found that Tet1 is an E4.5-specific EPI transcript for fold change 2. We can speculate a molecular mechanism in which TCFAP2C and PRDM14 co-operate in repressing de novo methylation by the direct repression of Dnmt3l (by recruiting PRC2), and by accelerating the TET–BER pathway at E3.5 in EPIs. We can hypothesize that TCFAP2C and PRDM14 prepare the chromatin for possible preimplantation processes in EPI single cells at E4.5, which together with the very early activation of PGC markers could trigger the events toward very early priming of PGCs, something that needs to be investigated further.

The possibility that the transcription factor TCFAP2C could exert its regulation over Dnmt3l by repression instead of promoting expression, as revealed by their transcriptomics complimentary profiles (Fig. 4A and B), provides important molecular mechanics information in two ways. On one side, this complements the chromatin-immunoprecipitation results of Schemmer et al. (2013). Additionally, the ChIP-chip data of Kidder and Palmer (2010) in trophoblast stem cells show that TCFAP2C binds to Dnmt3b. The recent combined ChIP-seq and gene expression data of Park et al. (2015) in an activated Neu mouse model of mammary carcinogenesis shows that TCFAP2C binds and represses the expression of Dnmt1 and Dnmt3b, thus indicating that TCFAP2C has a trend to repress DNA methyltransferases. On the other side, this finding could help improve computational tools that systematically discover ab initio transcription-factor binding motifs and their potential binding sites (Elemento and Tavazoie, 2005; Ettwiller et al., 2005; Xie et al., 2005; Müller-Molina et al., 2012). Such tools still lack the capability to assess whether the predicted binding sites are associated with repressing or promoting expression events. Thus, the repressive regulation of TCFAP2C over Dnmt3l complements computational results such as of Luu et al. (2013), and could be used as learning data to train computational tools for binding site prediction, to annotate these binding sites with an assessment of their repression/expression character.

The newly found markers for PE transitions (Dnmt3l, Dusp4, Cpne8, Akap13, Dcaf12l1, Aaed1, B4galt6, BC100530, Rnpc3, Tfpi, Lgalsl, Ckap4, Fbxl20) and EPI transitions (Sgk1, Mal, Ubxn2a, Atg16l2, Gm13102, Tcfap2c, Hexb, Slc1a1, Svip, Liph, Mier3), and for stable PE (Sdc4, Cpn1, Dkk1, Havcr1, F2r/Par1, Slc7a6os) and EPI (Zp3, Mcf2, Hexb) states could facilitate preimplantation studies. A very high percentage of them, such as the PE's CPNE8, AKAP13, TFPI, CKAP4, B4GALT6, SDC4, HAVCR1 and F2R, and the EPI's MAL, SLC1A1, LIPH and HEXB, are membrane proteins (especially enriched in the early precursors), which pinpoints the importance of the cell-to-cell contacts and signaling events during the PE–EPI specification, and which will also facilitate the use of molecular biology tools based on surface protein recognition in later studies.

Among the identified PE and EPI markers, we have found several ERK1/2 pathway-related genes such as Dusp4, Akap13, Cpne8, Dcaf12l1, Tfpi (E3.5-specific PE), Sgk1 (E3.5-specific EPI), Klf2 (EPI marker). This is in line with previous results showing that ERK2 is critical for the regulation of self-renewal and propagation of early ESC populations (Meloche et al., 2004) through phosphorylation of KLF4 (Kim et al., 2012).

Collectively, our findings based on the higher precision of single-cell omics data rather than the bulk data, provide a more precise dating of some of the early developmental stages, including not only PE and EPI specification but also some unexpected very early signals for PGC activation. The latter could lead to discovery of new very early PGC markers that could be employed in the differentiation of PGCs from pluripotent populations with potential uses in reproduction medicine.

Supplementary data

Supplementary data are available at http://molehr.oxfordjournals.org/.

Authors' roles

D.G. and M.J.A.-B. performed the experiments, analyzed the data and wrote the manuscript. M.J.A.-B. designed the study.

Funding

This work was supported by grants DFG15/14 and DFG15/020 from Diputación Foral de Gipuzkoa, and grant II14/00016 from I+D+I National Plan 2013–2016 and FEDER funds.

Conflict of interest

None declared.

References

Artus
J
,
Panthier
JJ
,
Hadjantonakis
AK
.
A role for PDGF signaling in expansion of the extra-embryonic endoderm lineage of the mouse blastocyst
.
Development
 
2010
;
137
:
3361
3372
.
Auman
HJ
,
Nottoli
T
,
Lakiza
O
,
Winger
Q
,
Donaldson
S
,
Williams
T
.
Transcription factor AP-2γ is essential in the extra-embryonic lineages for early postimplantation development
.
Development
 
2002
;
129
:
2733
2747
.
Bedzhov
I
,
Graham
SJ
,
Leung
CY
,
Zernicka-Goetz
M
.
Developmental plasticity, cell fate specification and morphogenesis in the early mouse embryo
.
Philos Trans R Soc Lond B Biol Sci
 
2014
;
369
:
1657
.
Review
.
Borgel
J
,
Guibert
S
,
Li
Y
,
Chiba
H
,
Schübeler
D
,
Sasaki
H
,
Forné
T
,
Weber
M
.
Targets and dynamics of promoter DNA methylation during early mouse development
.
Nat Genet
 
2010
;
42
:
1093
1100
.
Bourc'his
D
,
Xu
G-L
,
Lin
C-S
,
Bollman
B
,
Bestor
TH
.
Dnmt3L and the establishment of maternal genomic imprints
.
Science
 
2001
;
294
:
2536
2539
.
Cao
F
,
Fukuda
A
,
Watanabe
H
,
Kono
T
.
The transcriptomic architecture of mouse Sertoli cell clone embryos reveals temporal–spatial-specific reprogramming
.
Reproduction
 
2013
;
145
:
277
288
.
Chazaud
C
,
Yamanaka
Y
,
Pawson
T
,
Rossant
J
.
Early lineage segregation between epiblast and primitive endoderm in mouse blastocysts through the Grb2-MAPK pathway
.
Dev Cell
 
2006
;
10
:
615
624
.
Chiquoine
AD
.
The identification, origin and migration of the primordial germ cells in the mouse embryo
.
Anat Rec
 
1956
;
118
:
135
146
.
Choi
I
,
Carey
TS
,
Wilson
CA
,
Knott
JG
.
Transcription factor AP-2γ is a core regulator of tight junction biogenesis and cavity formation during mouse early embryogenesis
.
Development
 
2012
;
139
:
4623
4632
.
Deng
Q
,
Ramsköld
D
,
Reinius
B
,
Sandberg
R
.
Single-cell RNA-seq reveals dynamic, random monoallelic gene expression in mammalian cells
.
Science
 
2014
;
343
:
193
196
.
Elemento
O
,
Tavazoie
S
.
Fast and systematic genome-wide discovery of conserved regulatory elements using a non-alignment based approach
.
Genome Biol
 
2005
;
6
:
R18
.
Ettwiller
L
,
Paten
B
,
Souren
M
,
Loosli
F
,
Wittbrodt
J
,
Birney
E
.
The discovery, positioning and verification of a set of transcription-associated motifs in vertebrates
.
Genome Biol
 
2005
;
6
:
R104
.
Galland
F
,
Pirisi
V
,
de Lapeyriere
O
,
Birnbaum
D
.
Restriction and complexity of Mcf2 proto-oncogene expression
.
Oncogene
 
1991
;
6
:
833
839
.
Gallardo
TD
,
John
GB
,
Shirley
L
,
Contreras
CM
,
Akbay
EA
,
Haynie
JM
,
Ward
SE
,
Shidler
MJ
,
Castrillon
DH
.
Genomewide discovery and classification of candidate ovarian fertility genes in the mouse
.
Genetics
 
2007
;
177
:
179
194
.
Ginsburg
M
,
Snow
MH
,
McLaren
A
.
Primordial germ cells in the mouse embryo during gastrulation
.
Development
 
1990
;
110
:
521
528
.
Giritharan
G
,
Li
MW
,
Di Sebastiano
F
,
Esteban
FJ
,
Horcajadas
JA
,
Lloyd
KC
,
Donjacour
A
,
Maltepe
E
,
Rinaudo
PF
.
Effect of ICSI on gene expression and development of mouse preimplantation embryos
.
Hum Reprod
 
2010
;
25
:
3012
3024
.
Gowher
H
,
Liebert
K
,
Hermann
A
,
Xu
G
,
Jeltsch
A
.
Mechanism of stimulation of catalytic activity of Dnmt3A and Dnmt3B DNA-(cytosine-C5)-methyltransferases by Dnmt3L
.
J Biol Chem
 
2005
;
280
:
13341
13348
.
Grindberg
RV
,
Yee-Greenbaum
JL
,
McConnell
MJ
,
Novotny
M
,
O'Shaughnessy
AL
,
Lambert
GM
,
Araúzo-Bravo
MJ
,
Lee
J
,
Fishman
M
,
Robbins
GE
et al
.
RNA-sequencing from single nuclei
.
Proc Natl Acad Sci U S A
 
2013
;
110
:
19802
19807
.
Guenatri
M
,
Duffié
R
,
Iranzo
J
,
Fauque
P
,
Bourc'his
D
.
Plasticity in Dnmt3L-dependent and -independent modes of de novo methylation in the developing mouse embryo
.
Development
 
2013
;
140
:
562
572
.
Hamatani
T
,
Carter
MG
,
Sharov
AA
,
Ko
MS
.
Dynamics of global gene expression changes during mouse preimplantation development
.
Dev Cell
 
2004
;
6
:
117
131
.
Hata
K
,
Okano
M
,
Lei
H
,
Li
E
.
Dnmt3L cooperates with the Dnmt3 family of de novo DNA methyltransferases to establish maternal imprints in mice
.
Development
 
2002
;
129
:
1983
1993
.
Inoue
N
,
Hess
KD
,
Moreadith
RW
,
Richardson
LL
,
Handel
MA
,
Watson
ML
,
Zinn
AR
.
New gene family defined by MORC, a nuclear protein required for mouse spermatogenesis
.
Hum Mol Genet
 
1999
;
8
:
1201
1207
.
Irizarry
RA
,
Bolstad
BM
,
Collin
F
,
Cope
LM
,
Hobbs
B
,
Speed
TP
.
Summaries of Affymetrix GeneChip probe level data
.
Nucleic Acids Res
 
2013
;
31
:
e15
.
Jin
J
,
Cardozo
T
,
Lovering
RC
,
Elledge
SJ
,
Pagano
M
,
Harper
JW
.
Systematic analysis and nomenclature of mammalian F-box proteins
.
Genes Dev
 
2004
;
18
:
2573
2580
.
Kaneda
M
,
Okano
M
,
Hata
K
,
Sado
T
,
Tsujimoto
N
,
Li
E
,
Sasaki
H
.
Essential role for de novo DNA methyltransferase Dnmt3a in paternal and maternal imprinting
.
Nature
 
2004
;
429
:
900
903
.
Kaneda
M
,
Hirasawa
R
,
Chiba
H
,
Okano
M
,
Li
E
,
Sasaki
H
.
Genetic evidence for Dnmt3a-dependent imprinting during oocyte growth obtained by conditional knockout with Zp3-Cre and complete exclusion of Dnmt3b by chimera formation
.
Genes Cells
 
2010
;
15
:
169
179
.
Kidder
BL
,
Palmer
S
.
Examination of transcriptional networks reveals an important role for TCFAP2C, SMARCA4, and EOMES in trophoblast stem cell maintenance
.
Genome Res
 
2010
;
20
:
458
472
.
Kim
MO
,
Kim
SH
,
Cho
YY
,
Nadas
J
,
Jeong
CH
,
Yao
K
,
Kim
DJ
,
Yu
DH
,
Keum
YS
,
Lee
KY
et al
.
ERK1 and ERK2 regulate embryonic stem cell self-renewal through phosphorylation of Klf4
.
Nat Struct Mol Biol
 
2012
;
19
:
283
290
.
King
AG
,
Ozanne
BW
,
Smythe
C
,
Ashworth
A
.
Isolation and characterisation of a uniquely regulated threonine, tyrosine phosphatase (TYP 1) which inactivates ERK2 and p54jnk
.
Oncogene
 
1995
;
11
:
2553
2563
.
Kuckenberg
P
,
Buhl
S
,
Woynecki
T
,
van Fürden
B
,
Tolkunova
E
,
Seiffe
F
,
Moser
M
,
Tomilin
A
,
Winterhager
E
,
Schorle
H
.
The transcription factor TCFAP2C/AP-2gamma cooperates with CDX2 to maintain trophectoderm formation
.
Mol Cell Biol
 
2010
;
30
:
3310
3320
.
Kurimoto
K
,
Yabuta
Y
,
Ohinata
Y
,
Ono
Y
,
Uno
KD
,
Yamada
RG
,
Ueda
HR
,
Saitou
M
.
An improved single-cell cDNA amplification method for efficient high-density oligonucleotide microarray analysis
.
Nucleic Acids Res
 
2006
;
34
:
e42
.
Lange
UC
,
Saitou
M
,
Western
PS
,
Barton
SC
,
Surani
MA
.
The fragilis interferon-inducible gene family of transmembrane proteins is associated with germ cell specification in mice
.
BMC Dev Biol
 
2003
;
19
:
3:1
.
Lawson
KA
,
Hage
WJ
.
Clonal analysis of the origin of primordial germ cells in the mouse
.
Ciba Found Symp
 
1994
;
182
:
68
91
.
Lawson
KA
,
Dunn
NR
,
Roelen
BA
,
Zeinstra
LM
,
Davis
AM
,
Wright
CV
,
Korving
JP
,
Hogan
BL
.
Bmp4 is required for the generation of primordial germ cells in the mouse embryo
.
Genes Dev
 
1999
;
13
:
424
436
.
Lieven
O
,
Knobloch
J
,
Rüther
U
.
The regulation of Dkk1 expression during embryonic development
.
Dev Biol
 
2010
;
340
:
256
268
.
Liu
H
,
Yu
J
,
Xia
T
,
Xiao
Y
,
Zhang
Q
,
Liu
B
,
Guo
Y
,
Deng
J
,
Deng
Y
,
Chen
S
et al
.
Hepatic serum- and glucocorticoid-regulated protein kinase 1 (SGK1) regulates insulin sensitivity in mice via extracellular-signal-regulated kinase 1/2 (ERK1/2)
.
Biochem J
 
2014
;
464
:
281
289
.
Lunsford
RD
,
Jenkins
NA
,
Kozak
CA
,
Liang
LF
,
Silan
CM
,
Copeland
NG
,
Dean
J
.
Genomic mapping of murine Zp-2 and Zp-3, two oocyte-specific loci encoding zona pellucida proteins
.
Genomics
 
1990
;
6
:
184
187
.
Luu
PL
,
Schöler
HR
,
Araúzo-Bravo
MJ
.
Disclosing the crosstalk among DNA methylation, transcription factors, and histone marks in human pluripotent cells through discovery of DNA methylation motifs
.
Genome Res
 
2013
;
23
:
2013
2029
.
Maekawa
M
,
Yamamoto
T
,
Kohno
M
,
Takeichi
M
,
Nishida
E
.
Requirement for ERK MAP kinase in mouse preimplantation development
.
Development
 
2007
;
134
:
2751
2759
.
Maitra
R
,
Grigoryev
DN
,
Bera
TK
,
Pastan
IH
,
Lee
B
.
Cloning, molecular characterization, and expression analysis of Copine 8
.
Biochem Biophys Res Commun
 
2003
;
303
:
842
847
.
Meloche
S
,
Vella
FD
,
Voisin
L
,
Ang
SL
,
Saba-El-Leil
M
.
Erk2 signaling and early embryo stem cell self-renewal
.
Cell Cycle
 
2004
;
3
:
241
2413
.
Mikedis
MM
,
Downs
KM
.
Collagen type IV and Perlecan exhibit dynamic localization in the Allantoic Core Domain, a putative stem cell niche in the murine allantois
.
Dev Dyn
 
2009
;
238
:
3193
3204
.
Miner
JH
,
Li
C
,
Mudd
JL
,
Go
G
,
Sutherland
AE
.
Compositional and structural requirements for laminin and basement membranes during mouse embryo implantation and gastrulation
.
Development
 
2004
;
131
:
2247
2256
.
Moriwaki
K
,
Tsukita
S
,
Furuse
M
.
Tight junctions containing claudin 4 and 6 are essential for blastocyst formation in preimplantation mouse embryos
.
Dev Biol
 
2007
;
312
:
509
522
.
Morris
SA
,
Teo
RT
,
Li
H
,
Robson
P
,
Glover
DM
,
Zernicka-Goetz
M
.
Origin and formation of the first two distinct cell types of the inner cell mass in the mouse embryo
.
Proc Natl Acad Sci USA
 
2010
;
107
:
6364
6369
.
Morris
SA
,
Graham
SJ
,
Jedrusik
A
,
Zernicka-Goetz
M
.
The differential response to Fgf signalling in cells internalized at different times influences lineage segregation in preimplantation mouse embryos
.
Open Biol
 
2013
;
3
:
130104
.
Müller-Molina
AJ
,
Schöler
HR
,
Araúzo-Bravo
MJ
.
Comprehensive human transcription factor binding site map for combinatory binding motifs discovery
.
PLoS One
 
2012
;
7
:
e49086
.
Nakamura
T
,
Yabuta
Y
,
Okamoto
I
,
Aramaki
S
,
Yokobayashi
S
,
Kurimoto
K
,
Sekiguchi
K
,
Nakagawa
M
,
Yamamoto
T
,
Saitou
M
.
SC3-seq: a method for highly parallel and quantitative measurement of single-cell gene expression
.
Nucleic Acids Res
 
2015
;
43
:
e60
.
Niakan
KK
,
Ji
H
,
Maehr
R
,
Vokes
SA
,
Rodolfa
KT
,
Sherwood
RI
,
Yamaki
M
,
Dimos
JT
,
Chen
AE
,
Melton
DA
et al
.
Sox17 promotes differentiation in mouse embryonic stem cells by directly regulating extraembryonic gene expression and indirectly antagonizing self-renewal
.
Genes Dev
 
2010
;
24
:
312
326
.
Ohnishi
Y
,
Huber
W
,
Tsumura
A
,
Kang
M
,
Xenopoulos
P
,
Kurimoto
K
,
Oleś
AK
,
Araúzo-Bravo
MJ
,
Saitou
M
,
Hadjantonakis
AK
et al
.
Cell-to-cell expression variability followed by signal reinforcement progressively segregates early mouse lineages
.
Nat Cell Biol
 
2014
;
16
:
27
37
.
Okano
M
,
Bell
DW
,
Haber
DA
,
Li
E
.
DNA methyltransferases Dnmt3a and Dnmt3b are essential for de novo methylation and mammalian development
.
Cell
 
1999
;
99
:
247
257
.
Okashita
N
,
Kumaki
Y
,
Ebi
K
,
Nishi
M
,
Okamoto
Y
,
Nakayama
M
,
Hashimoto
S
,
Nakamura
T
,
Sugasawa
K
,
Kojima
N
et al
.
PRDM14 promotes active DNA demethylation through the ten-eleven translocation (TET)-mediated base excision repair pathway in embryonic stem cells
.
Development
 
2014
;
141
:
269
280
.
Pantaleon
M
,
Scott
J
,
Kaye
PL
.
Nutrient sensing by the early mouse embryo: hexosamine biosynthesis and glucose signaling during preimplantation development
.
Biol Reprod
 
2008
;
78
:
595
600
.
Park
JM
,
Wu
T
,
Cyr
AR
,
Woodfield
GW
,
De Andrade
JP
,
Spanheimer
PM
,
Li
T
,
Sugg
SL
,
Lal
G
,
Domann
FE
et al
.
The role of Tcfap2c in tumorigenesis and cancer growth in an activated Neu model of mammary carcinogenesis
.
Oncogene
 
2015
;
34
:
6105
6114
.
Pastor
WA
,
Stroud
H
,
Nee
K
,
Liu
W
,
Pezic
D
,
Manakov
S
,
Lee
SA
,
Moissiard
G
,
Zamudio
N
,
Bourc'his
D
et al
.
MORC1 represses transposable elements in the mouse male germline
.
Nat Commun
 
2014
;
5
:
5795
.
Provençal
M
,
Michaud
M
,
Beaulieu
E
,
Ratel
D
,
Rivard
GE
,
Gingras
D
,
Béliveau
R
.
Tissue factor pathway inhibitor (TFPI) interferes with endothelial cell migration by inhibition of both the Erk pathway and focal adhesion proteins
.
Thromb Haemost
 
2008
;
99
:
576
585
.
Rankin
T
,
Familari
M
,
Lee
E
,
Ginsberg
A
,
Dwyer
N
,
Blanchette-Mackie
J
,
Drago
J
,
Westphal
H
,
Dean
J
.
Mice homozygous for an insertional mutation in the Zp3 gene lack a zona pellucida and are infertile
.
Development
 
1996
;
122
:
2903
2910
.
Reinhardt
P
,
Schmid
B
,
Burbulla
LF
,
Schöndorf
DC
,
Wagner
L
,
Glatza
M
,
Höing
S
,
Hargus
G
,
Heck
SA
,
Dhingra
A
et al
.
Genetic correction of a LRRK2 mutation in human iPSCs links parkinsonian neurodegeneration to ERK-dependent changes in gene expression
.
Cell Stem Cell
 
2013
;
12
:
354
367
.
Roller
RJ
,
Kinloch
RA
,
Hiraoka
BY
,
Li
SS
,
Wassarman
PM
.
Gene expression during mammalian oogenesis and early embryogenesis: quantification of three messenger RNAs abundant in fully grown mouse oocytes
.
Development
 
1989
;
106
:
251
261
.
Ron
D
,
Tronick
SR
,
Aaronson
SA
,
Eva
A
.
Molecular cloning and characterization of the human dbl proto-oncogene: evidence that its overexpression is sufficient to transform NIH/3T3 cells
.
EMBO J
 
1988
;
7
:
2465
2473
.
Sabour
D
,
Araúzo-Bravo
MJ
,
Hübner
K
,
Ko
K
,
Greber
B
,
Gentile
L
,
Stehling
M
,
Schöler
HR
.
Identification of genes specific to mouse primordial germ cells through dynamic global gene expression
.
Hum Mol Genet
 
2011
;
20
:
115
125
.
Saitou
M
,
Yamaji
M
.
Primordial germ cells in mice
.
Cold Spring Harb Perspect Biol
 
2012
;
4
:
a008375
.
Saitou
M
,
Barton
SC
,
Surani
MA
.
A molecular programme for the specification of germ cell fate in mice
.
Nature
 
2002
;
418
:
293
300
.
Saliba
A-E
,
Westermann
AJ
,
Gorsski
SA
,
Vogel
J
.
Single-cell RNA-seq: advances and future challenges
.
Nucleic Acids Res
 
2014
;
42
:
8845
8860
.
Review
.
Schemmer
J
,
Araúzo-Bravo
MJ
,
Haas
N
,
Schäfer
S
,
Weber
SN
,
Becker
A
,
Eckert
D
,
Zimmer
A
,
Nettersheim
D
,
Schorle
H
.
Transcription factor TFAP2C regulates major programs required for murine fetal germ cell maintenance and haploinsufficiency predisposes to teratomas in male mice
.
PLoS One
 
2013
;
8
:
e71113
.
Schrode
N
,
Xenopoulos
P
,
Piliszek
A
,
Frankenberg
S
,
Plusa
B
,
Hadjantonakis
AK
.
Anatomy of a blastocyst: cell behaviors driving cell fate choice and morphogenesis in the early mouse embryo
.
Genesis
 
2013
;
51
:
219
233
.
Review
.
Shapiro
E
,
Biezuner
T
,
Linnarsson
S
.
Single-cell sequencing-based technologies will revolutionize whole-organism science
.
Nat Rev Genet
 
2013
;
14
:
618
630
.
Review
.
Smith
FD
,
Langeberg
LK
,
Cellurale
C
,
Pawson
T
,
Morrison
DK
,
Davis
RJ
,
Scott
JD
.
AKAP-Lbc enhances cyclic AMP control of the ERK1/2 cascade
.
Nat Cell Biol
 
2010
;
12
:
1242
1249
.
Smith
ZD
,
Chan
MM
,
Mikkelsen
TS
,
Gu
H
,
Gnirke
A
,
Regev
A
,
Meissner
A
.
A unique regulatory phase of DNA methylation in the early mammalian embryo
.
Nature
 
2012
;
484
:
339
344
.
Stanghellini
I
,
Falco
G
,
Lee
SL
,
Monti
M
,
Ko
MS
.
Trim43a, Trim43b, and Trim43c: Novel mouse genes expressed specifically in mouse preimplantation embryos
.
Gene Expr Patterns
 
2009
;
9
:
595
602
.
Subramanian
A
,
Tamayo
P
,
Mootha
VK
,
Mukherjee
S
,
Ebert
BL
,
Gillette
MA
,
Paulovich
A
,
Pomeroy
SL
,
Golub
TR
,
Lander
ES
et al
.
Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles
.
Proc Natl Acad Sci USA
 
2005
;
102
:
15545
15550
.
Sugawa
F
,
Araúzo-Bravo
MJ
,
Yoon
J
,
Kim
KP
,
Aramaki
S
,
Wu
G
,
Stehling
M
,
Psathaki
OE
,
Hübner
K
,
Schöler
HR
.
Human primordial germ cell commitment in vitro associates with a unique PRDM14 expression profile
.
EMBO J
 
2015
;
34
:
1009
1024
.
Tang
F
,
Lao
K
,
Surani
MA
.
Development and applications of single-cell transcriptome analysis
.
Nat Methods
 
2011
;
8
:
S6
S11
.
Review
.
Watson
ML
,
Zinn
AR
,
Inoue
N
,
Hess
KD
,
Cobb
J
,
Handel
MA
,
Halaban
R
,
Duchene
CC
,
Albright
GM
,
Moreadith
RW
.
Identification of morc (microrchidia), a mutation that results in arrest of spermatogenesis at an early meiotic stage in the mouse
.
Proc Natl Acad Sci USA
 
1998
;
95
:
14361
14366
.
Winger
Q
,
Huang
J
,
Auman
HJ
,
Lewandoski
M
,
Williams
T
.
Analysis of transcription factor AP-2 expression and function during mouse preimplantation development
.
Biol Reprod
 
2006
;
75
:
324
333
.
Xie
X
,
Lu
J
,
Kulbokas
EJ
,
Golub
TR
,
Mootha
V
,
Lindblad-Toh
K
,
Lander
ES
,
Kellis
M
.
Systematic discovery of regulatory motifs in human promoters and 3′ UTRs by comparison of several mammals
.
Nature
 
2005
;
434
:
338
345
.
Yamaji
M
,
Seki
Y
,
Kurimoto
K
,
Yabuta
Y
,
Yuasa
M
,
Shigeta
M
,
Yamanaka
K
,
Ohinata
Y
,
Saitou
M
.
Critical function of Prdm14 for the establishment of the germ cell lineage in mice
.
Nat Genet
 
2008
;
40
:
1016
1022
.
Yamaji
M
,
Ueda
J
,
Hayashi
K
,
Ohta
H
,
Yabuta
Y
,
Kurimoto
K
,
Nakato
R
,
Yamada
Y
,
Shirahige
K
,
Saitou
M
.
PRDM14 ensures naive pluripotency through dual regulation of signaling and epigenetic pathways in mouse embryonic stem cells
.
Cell Stem Cell
 
2013
;
12
:
368
382
.
Yamanaka
Y
,
Lanner
F
,
Rossant
J
.
FGF signal-dependent segregation of primitive endoderm and epiblast in the mouse blastocyst
.
Development
 
2010
;
137
:
715
724
.
Yeo
JC
,
Jiang
J
,
Tan
ZY
,
Yim
GR
,
Ng
JH
,
Göke
J
,
Kraus
P
,
Liang
H
,
Gonzales
KA
,
Chong
HC
et al
.
Klf2 is an essential factor that sustains ground state pluripotency
.
Cell Stem Cell
 
2014
;
14
:
864
872
.