Xingjie Hao, Kai Wang, Chengguqiu Dai, Zeyang Ding, Wei Yang, Chaolong Wang, Shanshan Cheng, Integrative analysis of scRNA-seq and GWAS data pinpoints periportal hepatocytes as the relevant liver cell types for blood lipids, Human Molecular Genetics, Volume 29, Issue 18, 15 September 2020, Pages 3145–3153, https://doi.org/10.1093/hmg/ddaa188
Abstract
The liver, a heterogeneous tissue consisting of various cell types, is known to be relevant for blood lipid traits. By integrating summary statistics from genome-wide association studies (GWAS) of lipid traits and single-cell transcriptome data of the liver, we sought to identify specific cell types in the liver that were most relevant for blood lipid levels. We conducted differential expression analyses for 40 cell types from human and mouse livers to construct the liver cell-type specifically expressed gene sets (CT-SEGS). Under the assumption that CT-SEGS represented specific functions of each cell type, we applied stratified linkage disequilibrium score regression to determine cell types that were most relevant for complex traits and diseases. We first confirmed the validity of this method (of delineating functionally relevant cell types) by identifying the immune cell types as relevant for autoimmune diseases. We further showed that lipid GWAS signals were enriched in the human and mouse periportal hepatocytes. Our results provide important information to facilitate future cellular studies of the metabolic mechanism affecting blood lipid levels.
Introduction
The liver is a heterogeneous tissue critical for metabolic and immune functions. The cellular organization of the liver is based on the building block of the hepatic acinus with different cell types inside. Among these cell types, hepatocytes make up the largest proportion of cell populations, and play an important role in metabolic, secretory and endocrine functions (1). Kupffer cells are the liver resident macrophages and have been described as the immunological sentinels of the liver (2–4). Hepatic stellate cells in the space of Disse are the main storage site for vitamin A, and are the major contributor to liver fibrosis (5). Some liver-infiltrating lymphocytes, including B cells, T cells and natural killer cells, are distributed in specific patterns, while many details remain unknown in terms of cellular locations and functions of these lymphocytes (6,7).
As an important tissue for metabolic and immune processes, the liver is constantly exposed to gut-derived dietary and microbial antigens, and is thus relevant for many complex traits and diseases (8). The bile acids produced by the liver are necessary for the breakdown of fat and emulsification of lipids. In addition, the liver plays important roles in many other metabolic processes, including regulation of glycogen storage, decomposition of red blood cells, and production of hormones. Among the solid organs in the body, the liver has the largest population of tissue-resident macrophages, which can affect the progression of liver diseases (3,6).
Genome-wide association studies (GWAS) of blood lipid levels, including high-density lipoprotein (HDL), low-density lipoprotein (LDL), total cholesterol (TC) and triglycerides (TG), have identified numerous association signals enriched in the liver-specific expressed genes and the liver-specific epigenetic modification regions (9–15). Although it is clear that the liver is a relevant tissue for blood lipid levels, the resolution is insufficient to guide subsequent experiments to connect the GWAS results to cellular functions, because of the heterogeneity of cell type composition in the liver.
Recently, single-cell RNA sequencing (scRNA-seq) has emerged as a useful approach to quantify the transcriptome of individual cells and to cluster cells into putative populations based on their gene expression (16,17). Many cell populations with different functions have been identified in human and mouse livers by scRNA-seq (6,7,18–21). Integrative analyses of scRNA-seq data and GWAS results have helped pinpoint the most relevant brain cell types for several neurological disorders (15,22–24) and insomnia (25). Given that the liver is a heterogeneous tissue consisting of various cell populations with distinct functions (6,7,18–21), identification of specific liver cell types relevant for blood lipid levels will provide essential information for future functional and cellular studies to understand the metabolism of lipids. Thus, in this study, we performed an integrative analysis of GWAS summary statistics of blood lipid levels and scRNA-seq data from the human liver and the mouse liver, with the goal of pinpointing specific liver cell types relevant to blood lipid levels.
Results
Construction of the liver cell-type specifically expressed gene sets (CT-SEGS)
We selected 186 and 274 upregulated genes for each cell type from human and mouse livers, respectively, to construct the CT-SEGS (Fig. 1 and Methods). We provided a global picture of the sharing of upregulated genes across cell types of human and mouse liver by using the generalized Jaccard similarity index (i.e. the size of the intersection divided by the size of the union of the sample sets), a measure of overlap between upregulated genes in two cell types (Fig. 2). The average cross-cell-type Jaccard similarity indices were 0.009 in human liver CT-SEGS and 0.008 in mouse liver CT-SEGS. In addition, the average Jaccard similarity index was 0.003 between human and mouse liver CT-SEGS (Fig. 2). Through hierarchical clustering, similar cell types from both human and mouse liver tended to cluster together, broadly forming four cell type groups (Supplementary Note): endothelial cells, immune cells, hepatocyte cells and epithelial cells. For example, the endothelial cell group contained four cell types (i.e. central venous liver sinusoidal endothelial cells (LSECs), periportal (PP) LSECs, portal endothelial cells and stellate cells from the human liver), and the epithelial cell group contained three cell types (i.e. cholangiocytes from human liver, epithelia cell_Spp1 high and epithelial cell from the mouse liver) (Figs 2 and 3). The average Jaccard similarity indices were 0.189, 0.062, 0.085 and 0.092 within each cell type group (endothelial cell, immune cell, hepatocyte cell and epithelial cell).
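The overlap measure described above can be illustrated with a minimal sketch. It uses the plain set-based Jaccard index (intersection size divided by union size) as defined in the parenthetical; the gene sets and cell type names are hypothetical toy values, not the actual CT-SEGS, and the helper name `jaccard` is an assumption made for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B|; returns 0.0 if both sets are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical toy gene sets; the real CT-SEGS contain 186 (human) or 274 (mouse) genes each.
gene_sets = {
    "cell_type_A": {"GENE1", "GENE2", "GENE3", "GENE4"},
    "cell_type_B": {"GENE1", "GENE3", "GENE5", "GENE6"},
    "cell_type_C": {"GENE7", "GENE8", "GENE9", "GENE10"},
}

# Pairwise similarity for every unordered pair of cell types.
pairwise = {
    (x, y): jaccard(gene_sets[x], gene_sets[y])
    for x, y in combinations(sorted(gene_sets), 2)
}

# Average cross-cell-type similarity, analogous to the averages reported in the text.
mean_jaccard = sum(pairwise.values()) / len(pairwise)
```

A matrix of such pairwise values is what a hierarchical clustering step would take as input to order and group the cell types.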

The flowchart of the study. For each cell type in the human and mouse liver scRNA-seq gene expression matrix, we used the likelihood-ratio test for differential expression of each gene. We then selected the upregulated genes, ranked them by P value, took the top genes and added a 100-kb window to obtain a genome annotation. We used stratified LD score regression to test whether this annotation was significantly enriched for per-SNP heritability, conditional on the baseline model and the set of all genes.
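The annotation step in this flowchart can be sketched as follows. This is an illustration only: it assumes the 100-kb window is added on each side of a gene and that overlapping windows are merged, neither of which is spelled out here. The gene coordinates and the function names `annotation_intervals` and `in_annotation` are hypothetical.

```python
WINDOW_BP = 100_000  # window size stated in the text


def annotation_intervals(genes, window=WINDOW_BP):
    """Pad each (chrom, start, end) gene by `window` bp per side and merge overlaps.

    Assumption: padding is symmetric and start positions are clipped at 1.
    """
    padded = sorted((c, max(1, s - window), e + window) for c, s, e in genes)
    merged = []
    for chrom, start, end in padded:
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            prev = merged[-1]
            merged[-1] = (chrom, prev[1], max(prev[2], end))
        else:
            merged.append((chrom, start, end))
    return merged


def in_annotation(chrom, pos, intervals):
    """Return True if a SNP at (chrom, pos) falls inside any annotation interval."""
    return any(c == chrom and s <= pos <= e for c, s, e in intervals)


# Hypothetical gene coordinates for one cell type's gene set.
genes = [
    ("chr1", 150_000, 160_000),
    ("chr1", 300_000, 320_000),
    ("chr2", 50_000, 60_000),
]
intervals = annotation_intervals(genes)
```

Each SNP then receives a binary annotation value depending on whether it falls inside one of the merged intervals.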

Jaccard Index of overlap among the upregulated genes in different cell types in human and mouse liver. Hierarchical clustering is used for clustering and ordering the different cell types. The colors of the text indicate the origin of the liver cell types, and the colors under the text indicate the group categories of the liver cell types.

The enrichment of genetic signals of autoimmune diseases for human and mouse liver cell-type specifically expressed gene sets (CT-SEGS). The enrichment results of inflammatory bowel disease (A), multiple sclerosis (B), primary biliary cirrhosis (C), rheumatoid arthritis (D) and type 1 diabetes (E) for the human and mouse liver CT-SEGS. The gold dashed lines are the Bonferroni significance thresholds (P < 0.05/4) after adjusting for four cell type groups in the human and mouse liver. The order of the cell types is based on the hierarchical clustering of the Jaccard Index of overlap among the upregulated genes.
Application to autoimmune diseases to confirm the functional relevance of CT-SEGS
To check the biological specificity of our liver CT-SEGS, we applied the stratified linkage disequilibrium score regression (LDSC) enrichment analyses to five well-studied autoimmune diseases, including inflammatory bowel disease, multiple sclerosis, primary biliary cirrhosis, rheumatoid arthritis and type 1 diabetes (Supplementary Material, Table S1). As expected, we found that the GWAS signals of autoimmune diseases tended to be enriched in both human and mouse liver immune CT-SEGS (Table 1 and Fig. 3). In particular, we found no enrichment in the hepatocytes for primary biliary cholangitis (the current name for primary biliary cirrhosis; Fig. 3C), a chronic disease in which the bile ducts in the liver are slowly destroyed. Interestingly, in addition to immune cells, we also identified cholangiocytes, the epithelial cells in bile ducts, as a top relevant cell type for primary biliary cholangitis (P = 0.0188) (Table 1).
Estimates of the enrichment coefficient of human and mouse liver CT-SEGS for human autoimmune diseases
Liver | Trait | Cell type | $\tau$ | SE($\tau$) | P value |
---|---|---|---|---|---|
Human | IBD | NK-like cells | 8.63E-08 | 4.87E-08 | 0.0382 |
Mouse | IBD | B cell_Jchain high | 7.21E-08 | 4.08E-08 | 0.0388 |
Mouse | IBD | T cell_Gzma high | 1.05E-07 | 5.79E-08 | 0.0344 |
Mouse | IBD | T cell_Trbc2 high | 1.55E-07 | 7.34E-08 | 0.0176 |
Human | MS | NK-like cells | 2.09E-07 | 6.21E-08 | 0.0004 |
Human | MS | CD3+ αβ T cells | 6.46E-08 | 3.37E-08 | 0.0278 |
Human | MS | γδ T cells 1 | 1.74E-07 | 5.69E-08 | 0.0011 |
Mouse | MS | Granulocyte | 7.05E-08 | 3.02E-08 | 0.0099 |
Mouse | MS | Erythroblast_Hbb-bs high | 1.01E-07 | 4.31E-08 | 0.0093 |
Human | MS | Mature B cells | 7.33E-08 | 4.07E-08 | 0.0358 |
Mouse | MS | B cell_Fcmr high | 5.62E-08 | 3.26E-08 | 0.0423 |
Mouse | MS | Dendritic cell_Cst3 high | 1.09E-07 | 4.73E-08 | 0.0104 |
Mouse | PBC | Kupffer cell | 1.17E-07 | 7.07E-08 | 0.0492 |
Mouse | PBC | B cell_Jchain high | 1.56E-07 | 7.59E-08 | 0.0198 |
Mouse | PBC | T cell_Trbc2 high | 1.74E-07 | 9.94E-08 | 0.0398 |
Human | PBC | Cholangiocytes | 1.57E-07 | 7.53E-08 | 0.0188 |
Human | RA | Mature B cells | 6.38E-08 | 2.80E-08 | 0.0113 |
Mouse | RA | B cell_Jchain high | 6.53E-08 | 2.57E-08 | 0.0056 |
Mouse | RA | T cell_Trbc2 high | 8.07E-08 | 3.75E-08 | 0.0156 |
Mouse | RA | B cell_Fcmr high | 6.77E-08 | 2.62E-08 | 0.0049 |
Human | T1D | NK-like cells | 1.34E-07 | 4.96E-08 | 0.0035 |
Human | T1D | CD3+ αβ T cells | 1.05E-07 | 4.27E-08 | 0.0069 |
Human | T1D | Mature B cells | 7.43E-08 | 3.94E-08 | 0.0296 |
Mouse | T1D | T cell_Gzma high | 6.97E-08 | 3.79E-08 | 0.0328 |
Mouse | T1D | T cell_Trbc2 high | 1.08E-07 | 5.22E-08 | 0.0190 |
Mouse | T1D | B cell_Fcmr high | 1.17E-07 | 3.96E-08 | 0.0015 |
Mouse | T1D | Dendritic cell_Cst3 high | 8.91E-08 | 3.03E-08 | 0.0016 |
Mouse | T1D | Hepatocyte_mt-Nd4 high | 6.32E-08 | 3.39E-08 | 0.0313 |
Mouse | T1D | Epithelial cell | 6.42E-08 | 3.55E-08 | 0.0354 |
CT-SEGS passing the Bonferroni significance thresholds are highlighted in bold. Only cell types with P < 0.05 for enrichment tests are listed. CT-SEGS, cell-type specifically expressed gene sets; IBD, inflammatory bowel disease; MS, multiple sclerosis; PBC, primary biliary cirrhosis; RA, rheumatoid arthritis; SE, standard error; T1D, type 1 diabetes.
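As a rough consistency check on Table 1, the tabulated P values appear to match a one-sided z-test of the coefficient against zero, with z = τ / SE(τ). This is an inference from the numbers in the table rather than something stated in this section, so the sketch should be read under that assumption; the function name `one_sided_p` is illustrative.

```python
import math


def one_sided_p(tau, se):
    """Upper-tail normal P value for H0: tau <= 0, using z = tau / se."""
    z = tau / se
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# (liver, trait, cell type, tau, SE, reported P) copied from Table 1.
rows = [
    ("Human", "IBD", "NK-like cells", 8.63e-08, 4.87e-08, 0.0382),
    ("Human", "MS", "NK-like cells", 2.09e-07, 6.21e-08, 0.0004),
    ("Human", "PBC", "Cholangiocytes", 1.57e-07, 7.53e-08, 0.0188),
]

for liver, trait, cell, tau, se, p_reported in rows:
    p_computed = one_sided_p(tau, se)
    print(f"{liver} {trait} {cell}: computed {p_computed:.4f}, reported {p_reported}")
```

Small differences between computed and reported values are expected because τ and SE(τ) are rounded to three significant figures in the table.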
Lipid GWAS signals were enriched in PP hepatocyte functional regions
Finally, we analyzed four lipid traits (HDL, LDL, TC and TG) by integrating the scRNA-seq profiles of human and mouse livers. We identified human Hep 3 and Hep 5 as the relevant cell types for blood lipid levels after Bonferroni correction for multiple tests (P < 0.05/n, where n = 4 is the number of cell type groups) (Table 2 and Fig. 4). We adjusted for the number of cell type groups rather than the number of cell types because gene expression levels in cell types from the same group were highly correlated and shared some specifically expressed genes (Fig. 2). Specifically, GWAS signals for HDL were enriched in human Hep 3, while the GWAS signals for LDL, TC and TG were significantly enriched in human Hep 5. In addition, among the four hepatocyte cell types in mouse liver, GWAS signals for blood lipids were only nominally significantly (P < 0.05) enriched in mouse PP hepatocyte cells. Notably, we found that mouse PP hepatocyte cells were significantly (P = 1.05E-4) relevant for the lipid traits in a meta-analysis of the four lipid traits (Supplementary Note). To test the robustness of our results, we also changed the selection criteria of upregulated genes in CT-SEGS by applying a P value threshold rather than fixing the number of genes (details are provided in the Supplementary Note). We found that the GWAS signals for HDL were still enriched in human Hep 3, and the GWAS signals for LDL, TC and TG were nominally significantly (P < 0.05) enriched in human Hep 5 and mouse PP hepatocyte cells (Supplementary Note). Apart from human and mouse hepatocytes, we did not identify any other cell types (e.g. immune cell types) whose CT-SEGS were enriched for lipid GWAS signals (Fig. 4).
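The Bonferroni rule described in this paragraph (P < 0.05/n with n = 4 cell type groups) can be applied directly to the P values reported in Table 2. The following sketch uses a subset of those rows; the variable names are illustrative.

```python
ALPHA = 0.05
N_GROUPS = 4  # number of cell type groups, as stated in the text
threshold = ALPHA / N_GROUPS  # 0.0125

# (liver, trait, cell type, P value) for a subset of rows copied from Table 2.
table2 = [
    ("Human", "HDL", "Hep 3", 0.0099),
    ("Human", "HDL", "Hep 5", 0.0200),
    ("Human", "LDL", "Hep 5", 0.0022),
    ("Human", "TC", "Hep 5", 0.0027),
    ("Human", "TG", "Hep 5", 0.0062),
    ("Mouse", "TC", "PP hepatocyte", 0.0159),
]

# Rows that pass the group-level Bonferroni threshold.
significant = [row for row in table2 if row[3] < threshold]
```

With these rows, the entries that pass are Hep 3 for HDL and Hep 5 for LDL, TC and TG, in line with the statement in the text, while the mouse PP hepatocyte result for TC stays at the nominal level.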
Estimates of the enrichment coefficient of human and mouse liver CT-SEGS for blood lipid levels
Liver | Trait | Cell type | $\tau$ | SE($\tau$) | P value |
---|---|---|---|---|---|
Mouse | HDL | Dendritic cell_Siglech high | 3.02E-08 | 1.79E-08 | 0.0456 |
Mouse | HDL | Periportal (PP) hepatocyte | 3.08E-08 | 1.81E-08 | 0.0442 |
Human | HDL | Hep 3 | 8.66E-08 | 3.72E-08 | 0.0099 |
Human | HDL | Hep 5 | 5.01E-08 | 2.44E-08 | 0.0200 |
Human | HDL | Hep 2 | 2.87E-08 | 1.34E-08 | 0.0158 |
Mouse | LDL | PP hepatocyte | 3.71E-08 | 2.05E-08 | 0.0355 |
Mouse | LDL | Hepatocyte_Fabp1 high | 2.88E-08 | 1.64E-08 | 0.0393 |
Human | LDL | Hep 5 | 1.18E-07 | 4.13E-08 | 0.0022 |
Mouse | TC | Neutrophil_Ngp high | 3.58E-08 | 1.90E-08 | 0.0300 |
Mouse | TC | PP hepatocyte | 5.38E-08 | 2.51E-08 | 0.0159 |
Human | TC | Hep 3 | 7.70E-08 | 4.21E-08 | 0.0338 |
Human | TC | Hep 4 | 8.08E-08 | 4.20E-08 | 0.0272 |
Human | TC | Hep 5 | 1.17E-07 | 4.21E-08 | 0.0027 |
Human | TG | γδ T cells 2 | 2.69E-08 | 1.39E-08 | 0.0267 |
Human | TG | Non-inflammatory macrophages | 2.48E-08 | 1.32E-08 | 0.0305 |
Mouse | TG | Erythroblast_Hbb-bt high | 2.85E-08 | 1.35E-08 | 0.0172 |
Mouse | TG | PP hepatocyte | 4.31E-08 | 2.28E-08 | 0.0296 |
Human | TG | Hep 3 | 4.37E-08 | 2.58E-08 | 0.0452 |
Human | TG | Hep 5 | 6.46E-08 | 2.59E-08 | 0.0062 |
Human | TG | Hep 1 | 4.06E-08 | 1.90E-08 | 0.0160 |
CT-SEGS passing the Bonferroni significance thresholds are highlighted in bold. Only cell types with P < 0.05 for enrichment tests are listed. HDL, high-density lipoprotein; LDL, low-density lipoprotein; SE, standard error; TC, total cholesterol; TG, triglycerides.

The enrichment of genetic signals of lipid traits for human and mouse liver cell-type specifically expressed gene sets (CT-SEGS). The enrichment results of HDL (A), LDL (B), TC (C) and TG (D) for the human and mouse liver CT-SEGS. The gold dashed lines are the Bonferroni significance thresholds (P < 0.05/4) after adjusting for four cell type groups in the human and mouse liver. The order of the cell types is based on the hierarchical clustering of the Jaccard Index of overlap among the upregulated genes.
Discussion
Previous studies have identified the liver as a relevant tissue for blood lipid levels by integrating GWAS results with functional annotations, such as those derived from bulk RNA-seq and chromatin immunoprecipitation sequencing (9–14,26–28). These studies, however, had limited resolution to identify specific liver cell types, because the functional annotations were derived from bulk sequencing of thousands of heterogeneous cells in the liver tissue. In this study, we pinpointed specific liver cell types relevant for the lipid traits by leveraging two recent scRNA-seq datasets derived from human and mouse livers, respectively. As a proof of concept, we performed the same analysis and showed that in the liver only the immune cell types were relevant to autoimmune diseases, as expected.
Using upregulated genes as the specifically expressed gene sets has been shown to be effective in identifying relevant tissues and cell types for many complex diseases and traits (11,13,15,23). Because different tissues and cell types might share chromatin states in the epigenome map (29), it is common that gene expressions are correlated in similar cell types, as reflected by the larger Jaccard similarity indices within cell type groups than those across groups in our analysis (Fig. 2). Our results also confirmed that similar cell types from human and mouse analogous tissues were clustered together based on upregulated genes from scRNA-seq data (30). The small Jaccard similarity indices between most cell type pairs suggested the upregulated genes in the CT-SEGS could capture the specificity of each cell type, as confirmed by our analysis of the autoimmune diseases (Fig. 3). In addition, our stratified LDSC analysis tested each CT-SEGS annotation conditional on 53 general annotations and the set of all genes that were common to all cell types (details in Methods) (10,11). Finally, we have tested the robustness of our results with different selection criteria of upregulated genes in the CT-SEGS (Supplementary Note).
We found that GWAS signals for lipid traits were significantly enriched in the functional regions of Hep 3 and Hep 5 in the human liver (Table 2). As shown by the Jaccard similarity index in Fig. 2, human Hep 3 and Hep 5 cells share a larger proportion of specifically expressed genes with mouse PP hepatocyte cells than with mouse pericentral (PC) hepatocyte cells (0.147 vs 0.072 for Hep 3 and 0.217 vs 0.065 for Hep 5). Based on gene expression patterns between the human hepatocyte cell types and mouse zonated cell types, human Hep 3 and Hep 5 cells are transcriptionally similar to PP mouse layers (6,18). Consistently, we found that GWAS signals for lipid traits were nominally significantly (P < 0.05) enriched in mouse PP hepatocyte cells. We thus speculate that the lipid relevant liver cell types were likely located at the outer PP layers in the hepatic lobule. Interestingly, the spatial construction of mouse liver by scRNA-seq suggested that the mammalian liver has different zones to optimize the liver functions and to act in distinct processes (18). The outer PP layers in the hepatic lobule produced higher levels of enzymes for energy-demanding tasks such as gluconeogenesis and ureagenesis (18), consistent with their potential role in lipid metabolism, whereas the inner layers were specialized in glycolysis and xenobiotic metabolism.
In conclusion, by integrative analyses of GWAS summary statistics and scRNA-seq of liver cell types, we found that the PP hepatocyte cell types play an important role in blood lipid levels. Our results provide important information for future cellular studies to investigate the metabolic processes underlying blood lipid levels. Nevertheless, the spatial distribution and functions of some liver cell populations, such as human Hep 4 and Hep 5, have not been fully explored, which could limit the interpretation of our results and future experiments. In addition, the power to identify the relevant cell types may be limited by the correlation of gene expression patterns between similar cell types. New scRNA-seq technologies with improved sensitivity and precision to detect lowly expressed genes are required for the identification of more differentially expressed genes, and thus to help differentiate similar cell types in future studies.
Materials and Methods
GWAS summary statistics
The blood lipid level GWAS summary statistics were downloaded from http://csg.sph.umich.edu/willer/public/lipids2010 (31). Briefly, the lipid GWAS was based on >100 000 individuals of European ancestry, and identified 95 loci at the genome-wide significant level (P < 5E-8). Given that the liver is highly infiltrated with immune cells, we also downloaded and analyzed GWAS summary statistics of five autoimmune diseases for comparison, including inflammatory bowel disease, multiple sclerosis, primary biliary cirrhosis, rheumatoid arthritis and type 1 diabetes. All the GWAS results were derived from European samples (see Supplementary Material, Table S1 for details).
Cell-type specific expression
We constructed the specifically expressed gene sets for human and mouse liver cell types using public datasets (6,7). Following previous studies (11,13,23), we chose the top upregulated genes in each focal cell type as the cell-type specifically expressed gene sets (CT-SEGS), which were not restricted to genes expressed only in the focal cell type (Fig. 1). Because the power to identify the differentially expressed genes depends on the sample size (i.e. the number of cells in each cell type) (32), the number of identified upregulated genes varied dramatically across different cell types. To avoid possible confounding caused by different cell type population sizes, we chose the same number of top upregulated genes for each cell type. To test the robustness of our results, we also constructed CT-SEGS using a fixed threshold of P < 0.05 in the differential expression analyses (described in this section), resulting in different numbers of upregulated genes in the CT-SEGS of different cell types (ranging from 50 to 1723; see Supplementary Note).
The human liver scRNA-seq was conducted on the 10X Genomics platform. Five cell type groups consisting of 20 cell types were identified among 8444 cells (6). We downloaded the human liver scRNA-seq counts (GSE115469), which were normalized using the default settings of the scran R package (33). To construct the specifically expressed gene sets, we performed 20 differential expression (DE) analyses for 17 900 genes using the likelihood-ratio test (34) implemented in Seurat 3.0 (35,36), each testing one cell type against the other 19 cell types. Similar to previous studies defining genes specifically expressed in a tissue (11,13,23), we selected the top 186 upregulated genes (fold change in log2 scale (logFC) > 0, expressed in > 10% of cells from the focal cell type) as the human CT-SEGS (Supplementary Material, Table S2) for each cell type. The number of 186 was determined based on the minimum number of upregulated genes across the 20 DE analyses of different human liver cell types. Among these genes, 88.0% had logFC > 0.1 and all of them had adjusted P value (false discovery rate [FDR]) < 0.05.
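The selection rule described above (keep upregulated genes expressed in more than 10% of focal cells, rank by P value, and take the same number of top genes for every cell type, set by the smallest list) can be sketched in a few lines. This is not the authors' Seurat workflow; the input records, field names and function names are hypothetical and stand in for the output of a differential expression test.

```python
def upregulated(de_results, min_pct=0.10):
    """Keep genes with logFC > 0 expressed in > min_pct of focal cells, ranked by P value."""
    kept = [g for g in de_results if g["logFC"] > 0 and g["pct_focal"] > min_pct]
    return sorted(kept, key=lambda g: g["p"])


def build_ct_segs(de_by_cell_type):
    """Take the same number of top upregulated genes for every cell type.

    The shared size is the minimum number of upregulated genes across cell types.
    """
    ranked = {ct: upregulated(res) for ct, res in de_by_cell_type.items()}
    n_top = min(len(genes) for genes in ranked.values())
    segs = {ct: [g["gene"] for g in genes[:n_top]] for ct, genes in ranked.items()}
    return n_top, segs


# Hypothetical toy DE output for two cell types.
de_by_cell_type = {
    "type_A": [
        {"gene": "G1", "logFC": 1.2, "pct_focal": 0.9, "p": 1e-30},
        {"gene": "G2", "logFC": 0.4, "pct_focal": 0.5, "p": 1e-8},
        {"gene": "G3", "logFC": -0.7, "pct_focal": 0.8, "p": 1e-20},  # downregulated
        {"gene": "G4", "logFC": 0.3, "pct_focal": 0.3, "p": 1e-12},
    ],
    "type_B": [
        {"gene": "G5", "logFC": 0.9, "pct_focal": 0.6, "p": 1e-15},
        {"gene": "G6", "logFC": 0.5, "pct_focal": 0.05, "p": 1e-25},  # too few cells
        {"gene": "G7", "logFC": 0.2, "pct_focal": 0.4, "p": 1e-3},
    ],
}

n_top, ct_segs = build_ct_segs(de_by_cell_type)
```

Fixing the set size this way keeps cell types with many cells, and therefore more power to detect differential expression, from contributing larger annotations than rare cell types.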
The mouse liver scRNA-seq was conducted on the Microwell-seq platform. Four cell type groups consisting of 20 cell types were identified among 4685 cells (7). We downloaded the raw sequencing counts data (https://figshare.com/articles/MCA_DGE_Data/5435866), and used an expert-curated human–mouse homolog list (http://www.informatics.jax.org/homology.shtml) to map mouse genes to their human homologs. We kept the genes with high mapping confidence (1:1 mapping). We normalized the counts data for the mappable genes using the default settings of the scran R package (33). We conducted DE analyses for the 11 942 genes as we did for the human liver scRNA-seq data. For each cell type, we selected the top 274 upregulated genes (the minimum number of upregulated genes across different mouse liver cell types) as the mouse liver CT-SEGS. Among these genes, 79.7% had logFC > 0.1 and 75.7% had FDR adjusted P value < 0.05 (Supplementary Material, Table S2).
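The 1:1 homolog filter mentioned above can be illustrated with a short sketch: a mouse gene is kept only if it maps to exactly one human gene and that human gene maps back to exactly one mouse gene. The pair list uses placeholder gene names and is not taken from the MGI homology file; the function name `one_to_one_homologs` is an assumption.

```python
from collections import Counter


def one_to_one_homologs(pairs):
    """Return {mouse_gene: human_gene} for pairs that are unique on both sides."""
    pairs = set(pairs)  # drop exact duplicate rows
    mouse_counts = Counter(m for m, _ in pairs)
    human_counts = Counter(h for _, h in pairs)
    return {
        m: h
        for m, h in pairs
        if mouse_counts[m] == 1 and human_counts[h] == 1
    }


# Hypothetical (mouse, human) homolog pairs with placeholder names.
pairs = [
    ("Gene1", "GENE1"),
    ("Gene2", "GENE2"),
    ("Gene3a", "GENE3"),  # many-to-one: two mouse genes share a human homolog
    ("Gene3b", "GENE3"),
    ("Gene4", "GENE4A"),  # one-to-many: one mouse gene has two human homologs
    ("Gene4", "GENE4B"),
    ("Gene5", "GENE5"),
]

mapping = one_to_one_homologs(pairs)
```

Only the unambiguous pairs survive, so downstream gene sets built from mouse data can be expressed in human gene coordinates without duplicated or conflicting entries.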
Partitioning of trait heritability to CT-SEGS
Acknowledgments
We thank all the GWAS consortium studies for making the summary data publicly available and are grateful to all the investigators and participants who contributed to those studies.
Conflict of Interest statement. The authors have declared no competing interests.
Funding
The National Natural Science Foundation of China (NSFC, 81973148, 82003561).
Author Contributions
XH conceived the study, performed the data collection and analysis. SC and CW supervised the study. XH drafted the manuscript with inputs from SC and CW. KW, DC, ZD and WY participated in data interpretation. All authors reviewed and approved the manuscript.