Abstract

Whole blood transcriptome analysis is a valuable approach in medical research, primarily because samples are easy to collect and the information obtained is rich. Since the expression profile of individual genes is influenced by medical traits and demographic attributes such as age and gender, there has been a growing demand for a comprehensive database for blood transcriptome analysis. Here, we performed whole blood RNA sequencing (RNA-seq) analysis on 576 participants stratified by age (20–30s and 60–70s) and gender from cohorts of the Tohoku Medical Megabank (TMM). The young female group included pregnant women. We did not exclude the globin gene family in our RNA-seq study, which enabled us to identify instances of hereditary persistence of fetal hemoglobin based on HBG1 and HBG2 expression. Comparing the stratified populations allowed us to identify groups of genes associated with age-related changes and gender differences. We also found that immune response status, particularly as measured by the neutrophil-to-lymphocyte ratio (NLR), strongly influences the diversity of individual gene expression profiles in whole blood transcriptome analysis. This stratification has resulted in a data set that will be highly beneficial for future whole blood transcriptome analyses in the Japanese population.

Abbreviations

     
  • AHSP: alpha hemoglobin stabilizing protein
  • BirThree: birth and three-generation
  • CommCohort: community-based cohort
  • DEG: differentially expressed gene
  • DV200: percentage of RNA greater than 200 nucleotides in length
  • eQTL: expression quantitative trait locus
  • FDR: false discovery rate
  • GTEx: Genotype-Tissue Expression
  • HMBS: hydroxymethylbilane synthase
  • HPFH: hereditary persistence of fetal hemoglobin
  • IMM: Iwate Tohoku Medical Megabank Organization
  • IRB: institutional review board
  • NLR: neutrophil-to-lymphocyte ratio
  • PCA: principal component analysis
  • PC1: first principal component
  • PC2: second principal component
  • RIN: RNA integrity number
  • RNA-seq: RNA sequencing
  • rRNA: ribosomal RNA
  • TMM: Tohoku Medical Megabank
  • ToMMo: Tohoku Medical Megabank Organization
  • TPM: transcripts per million
  • WGS: whole genome sequencing

In March 2011, a huge earthquake and tsunami devastated the northeastern coast of Japan (1). In an effort to innovatively regenerate the tsunami-damaged regions, we established the Tohoku Medical Megabank (TMM) Project (2, 3). This project aims to realize next-generation medicine through prospective genome cohort studies. To facilitate the TMM Project, the Tohoku Medical Megabank Organization (ToMMo) was established at Tohoku University, and the Iwate Tohoku Medical Megabank Organization (IMM) was established at Iwate Medical University. ToMMo and IMM have collaborated to create two types of cohorts (4). One is the community-based cohort (CommCohort Study), in which a total of 84,073 participants were recruited (5). The other is the birth and three-generation cohort (BirThree Cohort Study), which included pregnant mothers, their partners, parents, newborn babies and their elder siblings (6). In the BirThree Cohort Study, a total of 73,500 participants from 22,493 families were recruited through obstetrics hospitals and clinics in Miyagi Prefecture.

Employing these two prospective cohorts strategically, we established the TMM biobank, which houses biospecimens, health and clinical information, as well as genome-omics data (7). We carried out comprehensive health examinations during the initial survey and have been conducting similar examinations in follow-up surveys every five years. Importantly, we decided to establish the TMM biobank as an integrated biobank. For this purpose, we set up an analytical center within the TMM biobank that carries out large-scale genome and omics analyses (8). Through the analytical center, we have developed a Japanese whole-genome reference panel based on whole genome sequence data (9) and a custom SNP array specially designed for the Japanese population (10).

In our analytical center, we are conducting a range of omics analyses, including transcriptomics, epigenomics, proteomics, metabolomics and metagenomics (11–14). Our goal is to elucidate gene–environment interactions within the Japanese population and gather robust evidence that will facilitate the advancement of future medicine in Japan (15). Given the large-scale nature of the cohort samples we handle, we have strived to establish high-throughput procedures for omics analyses. Among the ongoing omics analyses, RNA sequencing (RNA-seq), or transcriptome analysis, is particularly compelling. This method can broadly survey and/or scrutinize gene expression profiles in a manner that whole genome sequencing (WGS) cannot (16). However, an inherent challenge of cohort transcriptome analysis lies in the limited availability of RNA sources.

Prospective genome cohorts are typically composed of healthy volunteers, which restricts the types of biological materials available for study. Therefore, we chose to use whole blood samples, as these are comparatively easy to collect from cohort participants. We hypothesized that transcriptome analysis of whole blood samples would provide extensive information reflecting individual health conditions (17). Indeed, transcriptome analysis of whole blood samples has been conducted globally (18–21). For instance, the Genotype-Tissue Expression (GTEx) Project has determined gene expression profiles for various tissues, including whole blood (22). The whole blood transcriptome data from GTEx have been used to estimate gene expression levels in other tissues (23). Similarly, whole blood transcriptome analysis was carried out in a case-control study targeting COVID-19 in Japan (24).

However, whole blood gene expression analysis targeting healthy populations in prospective cohorts has yet to be conducted. Such an analysis is particularly important because it remains uncertain how gene expression profiles in whole blood transcriptomes vary with age and gender. Additionally, there is no methodological consensus on whether it is more advantageous to remove globin mRNAs from the whole blood transcriptome (25). To address these gaps, we embarked on a large-scale transcriptome analysis using whole blood samples. These samples were collected from 576 participants of the TMM Project, comprising females and males in both young and elder age groups, with the aim of generating fundamental data on gene expression in the whole blood of the Japanese population.

Materials and Methods

Study design and sample selection

This project was performed as part of the prospective cohort study at the TMM, with the approval of the institutional review board (IRB) at Tohoku University. The samples used here were from cohort participants, all of whom gave TMM their written consent. For the whole blood transcriptome analysis, samples from 576 participants were selected randomly, with adjustments made for gender and age. The 576 whole blood samples comprised 120 females in their 20s to 30s, 172 females in their 60s to 70s, 109 males in their 20s to 30s and 175 males in their 60s to 70s. Of note, 14 of the 120 females in their 20s to 30s were pregnant. Cohort participants who suffered from cancer, high blood pressure, diabetes mellitus or hyperlipidemia were excluded.

Total RNA extraction from PAXgene® blood RNA tubes

As part of the baseline analysis, we collected approximately 2.5 mL of whole blood from each of 4337 participants in PAXgene® Blood RNA Tubes (#762165, BD Biosciences) containing 6.9 mL of the additive and stored the tubes at −80°C. Total RNA was extracted using the PAXgene® Blood RNA kit (#762174, BD Biosciences) according to the manufacturer's protocol, which included DNase I treatment to remove trace amounts of DNA. Concentrations of RNA samples were quantified with a NanoDrop 2000 (Thermo Fisher Scientific). The quality of RNA samples was evaluated by means of RIN (RNA integrity number) and DV200 (percentage of RNA greater than 200 nucleotides in length) values using RNA ScreenTape and Reagents (#5067–5576 and #5067–5577, Agilent Technologies) on a TapeStation 2200 (Agilent Technologies).

RNA-seq analysis

The extracted RNA samples were diluted to a concentration of 11 ng/μl in 96-well plates (20 μl or 220 ng/well). After ribosomal RNA (rRNA) was removed using an rRNA Depletion Kit (#1000005953, MGI Tech), RNA libraries were prepared using an MGISP-960 high-throughput automated sample preparation system (MGI Tech). From the RNA libraries, DNA libraries were constructed with the MGIEasy RNA Directional Library Prep Set (#1000006386, MGI Tech) according to the manufacturer's protocol. The DNA libraries were sequenced with the DNBSEQ-G400RS High-Throughput Sequencing Set (#1000016952 with FCL PE150 or #1000016995 with App-A FCL PE150, MGI Tech) using a DNBSEQ-G400 sequencer (MGI Tech).

Data analysis

Raw sequence reads were quality-controlled by removing low-quality bases using Trimmomatic (26) with default parameters. Trimmed reads were aligned to the human genome GRCh38 using STAR v2.5.3a (27) based on GENCODE release 28 gene annotations (28). Read counts and expression levels of individual genes, estimated as transcripts per million (TPM) (29), were calculated using RSEM v1.3.0 (30) with default parameters.
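As a reminder of what the TPM unit expresses, the following minimal sketch applies the standard TPM definition to toy numbers. It is not RSEM's estimation procedure, which uses its own statistical model; the function name `tpm` and all values are hypothetical.

```python
def tpm(counts, eff_lengths):
    """Convert read counts to TPM given effective gene lengths in bases."""
    # Reads per base for each gene (length-normalized abundance)
    rates = [c / l for c, l in zip(counts, eff_lengths)]
    total = sum(rates)
    if total == 0:
        return [0.0 for _ in rates]
    # Scale so that the values of one sample sum to one million
    return [r / total * 1e6 for r in rates]


# Hypothetical toy sample with three genes
counts = [100, 200, 300]
eff_lengths = [1000, 2000, 1000]
values = tpm(counts, eff_lengths)
print(values)
```

Because the values of each sample sum to one million, a very abundant transcript class such as globin mRNA necessarily lowers the TPM of every other gene.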

Expression levels of individual genes excluding the effect of globin genes were calculated from a read count matrix from which the components corresponding to 12 genes of the globin gene family had been removed (ENSG00000206172 HBA1, ENSG00000188536 HBA2, ENSG00000244734 HBB, ENSG00000229988 HBBP1, ENSG00000223609 HBD, ENSG00000213931 HBE1, ENSG00000213934 HBG1, ENSG00000196565 HBG2, ENSG00000206177 HBM, ENSG00000086506 HBQ1, ENSG00000130656 HBZ, ENSG00000206178 HBZP1), using an in-house Python script (contents are available upon request).
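The in-house script itself is not reproduced here. The following is only a minimal sketch of how such an in silico removal could look, assuming per-sample read counts keyed by Ensembl gene ID and a simple length-normalized TPM recalculation. The function names, the version suffixes, the non-globin gene names and all numbers are hypothetical.

```python
# Ensembl IDs of the 12 globin family genes listed in the text
GLOBIN_IDS = {
    "ENSG00000206172", "ENSG00000188536", "ENSG00000244734",
    "ENSG00000229988", "ENSG00000223609", "ENSG00000213931",
    "ENSG00000213934", "ENSG00000196565", "ENSG00000206177",
    "ENSG00000086506", "ENSG00000130656", "ENSG00000206178",
}


def remove_globin(counts):
    """Drop globin genes from a {gene_id: read_count} mapping of one sample."""
    # Annotation IDs may carry a version suffix such as ".8"; strip it first
    return {g: c for g, c in counts.items()
            if g.split(".")[0] not in GLOBIN_IDS}


def recompute_tpm(counts, eff_lengths):
    """Recalculate TPM from the remaining genes only."""
    rates = {g: c / eff_lengths[g] for g, c in counts.items()}
    total = sum(rates.values())
    return {g: (r / total * 1e6 if total else 0.0) for g, r in rates.items()}


# Hypothetical toy sample: two globin genes and two other genes
sample = {"ENSG00000206172.8": 9000, "ENSG00000244734.4": 5000,
          "GENE_A": 600, "GENE_B": 400}
lengths = {g: 1000 for g in sample}
filtered = remove_globin(sample)
non_globin_tpm = recompute_tpm(filtered, lengths)
print(non_globin_tpm)
```

In this toy case the two non-globin genes share the full one million after removal, which illustrates why expression values of non-globin genes become less dependent on the globin content of each sample.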

Principal component analysis (PCA) was performed with the prcomp function in the R environment, using the expression profiles of genes that were expressed in all samples. Differentially expressed genes (DEGs) were determined using the DESeq2 package (31) in the R environment with a false discovery rate (FDR) cutoff of 1%.
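The analysis in this study was run with prcomp in R. As an illustration only, the same kind of computation can be sketched in Python with NumPy: genes with non-zero expression in every sample are kept, each gene is centred (the prcomp default), and the scores and contribution rates are obtained by singular value decomposition. The log2 transform and the function name `pca_scores` are assumptions that are not stated in the text.

```python
import numpy as np


def pca_scores(expr):
    """expr: samples x genes matrix. Returns (scores, contribution rates)."""
    expr = np.asarray(expr, dtype=float)
    # Keep only genes expressed (non-zero) in every sample
    keep = (expr > 0).all(axis=0)
    # Log transform is an assumption of this sketch
    x = np.log2(expr[:, keep] + 1.0)
    # Centre each gene; no scaling, as in prcomp defaults
    x = x - x.mean(axis=0)
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    scores = u * s                  # coordinates of samples on each PC
    variance = s ** 2
    return scores, variance / variance.sum()


# Hypothetical random data: 20 samples, 30 genes, one gene never expressed
rng = np.random.default_rng(0)
expr = rng.gamma(2.0, 50.0, size=(20, 30))
expr[:, 0] = 0.0
scores, ratio = pca_scores(expr)
print(ratio[:2])
```

The first entry of the returned ratio corresponds to what the Results call the contribution rate of the PC1 axis.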

Whole blood transcriptome analysis toward the establishment of a gene expression reference panel for the Japanese population. (A) Overview of the study design. (B) Schematic diagram of the RNA-seq analysis flow.
Fig. 1


Data availability

The gene expression statistics are available on the jMorp website (https://jmorp.megabank.tohoku.ac.jp/) via a web interface (15). The raw sequence data are under controlled access, as they contain information restricted by the participants’ consent. The sequence data are available upon request after approval by the IRB and the Materials and Information Distribution Review Committee.

Results

Whole blood transcriptome analysis toward establishment of gene expression reference panel for Japanese population

To elucidate age- and gender-specific gene expression profiles of the Japanese population using whole blood, we performed RNA-seq analysis of 576 blood samples derived from the TMM biobank, comprising 120 females in their 20–30s, 172 females in their 60–70s, 109 males in their 20–30s and 175 males in their 60–70s (Fig. 1A, left panel). Of note, 14 of the 120 females in their 20–30s were pregnant. This reflects the nature of our cohort biobank, in which pregnant women were recruited into the BirThree Cohort Study. In this study, we selected samples randomly from all available participants.

Total RNA samples of sufficient quantity and quality for RNA-seq experiments were obtained from all 576 blood samples collected in PAXgene® Blood RNA Tubes, which had been cryopreserved in the TMM biobank for an extended period (Supplementary Fig. S1-S2, Table 1). It should be noted that in whole blood RNA-seq, both rRNA and globin mRNAs are generally depleted experimentally before sequencing to improve the quality of the transcriptome data. In this study, however, we decided to deplete only rRNAs and to retain globin mRNAs in the transcriptome experiment to gain insight into the population-level expression characteristics of the globin gene family (Fig. 1A, middle panel).

As it has been reported that in silico removal of globin mRNA genes from sequencing results improves the quality of the transcriptome data, we processed the transcriptome data and prepared gene expression profiles of non-globin genes by excluding the reads assigned to the globin gene family (Fig. 1A, right panel). We subsequently evaluated the gene expression data set separately for globin and non-globin genes.

A schematic diagram of the RNA-seq data analysis and the approach for excluding globin mRNAs during data analysis is shown in Figure 1B. In this protocol, raw sequence reads were first quality-controlled by removing low-quality bases using Trimmomatic, and trimmed reads were aligned to the human genome GRCh38. The read count and expression level of individual genes, estimated as TPM, were calculated using RSEM v1.3.0, and the expression levels excluding globin genes were calculated from the non-globin read count matrix using an in-house Python script. PCA was performed based on the expression profiles of genes found to be expressed in all samples.

Characteristics of globin gene expression

Taking advantage of our approach to whole blood transcriptome analysis, we first examined the expression characteristics of the globin family genes. We found substantial expression of both the HBA1/HBA2 (α1/α2 globin) and HBB (β globin) genes, which encode the subunits of adult-type hemoglobin, and this was confirmed in all 576 samples (Table 2). In this transcriptome analysis, the globin transcript content varied from sample to sample, ranging from 18.9% to 57.1%, in good agreement with the previously reported range of 20.0% to 62.6% (32). When we assessed the relationship between the expression levels of the globin genes by TPM, we found a strong correlation between the expression levels of the HBA1 and HBB genes (Fig. 2A). This indicates that the transcript levels of the HBA1 and HBB genes are well coordinated in adult peripheral blood. This observation is also consistent with the global gene co-expression data provided by COXPRESdb (https://coxpresdb.jp/locus/?gene_id=3043).

Table 1

Cohort RNA samples extracted from PAXgene® Blood RNA Tubes

               Ave.    SD       Min     Max
RNA (ng/μL)    33.5    ±19.6    4.2     160.0
RNA (μg)       2.7     ±1.6     0.2     12.8
RIN            7.9     ±0.6     4.9     9.1
DV200 (%)      89.3    ±6.6     35.1    97.7

RIN, RNA Integrity Number; DV200, percentage of RNA greater than 200 nt in length


Table 2

Expression statistics of globin genes

Locus       Gene    Detection Number    Detection Rate [%]    Minimum TPM    Median TPM     Maximum TPM
α-Globin    HBZ     0                   0.00                  0.00           5.76           46.83
α-Globin    HBM     0                   0.00                  17.18          85.71          699.21
α-Globin    HBA1    576                 100                   2.31 × 10^5    6.34 × 10^5    1.13 × 10^6
α-Globin    HBA2    576                 100                   2.59 × 10^4    8.49 × 10^4    1.69 × 10^5
α-Globin    HBQ     0                   0.00                  8.43           31.27          85.35
β-Globin    HBE     0                   0.00                  0.00           0.09           5.12
β-Globin    HBG2    7                   1.22                  42.69          3.49 × 10^3    5.28 × 10^5
β-Globin    HBG1    2                   0.35                  11.70          1.16 × 10^3    1.31 × 10^5
β-Globin    HBD     0                   0.00                  5.26           37.34          574.46
β-Globin    HBB     576                 100                   5.08 × 10^4    1.60 × 10^5    3.56 × 10^5

The column ‘Detection Number’ gives the number of samples in which the read count composition of the gene in question exceeds 1%.


We found that the TPM for HBA1 was approximately one order of magnitude higher than that for HBA2 (median TPM of 6.34 × 10^5 and 8.49 × 10^4, respectively), indicating that HBA1 is the most highly transcribed globin gene (Table 2). An intriguing observation is that the TPM for HBA1 was approximately 4-fold higher than that for HBB (Table 2 and Fig. 2A). The expression levels of these globin genes were almost normally distributed across the 576 individuals (Fig. 2A). Consistent with these findings, similar expression profiles for these globin transcripts have been reported (33). Hematopoietic indices of these 576 participants showed no sign of anemia caused by an imbalance of α and β globin chains (Fig. 2A). The reason(s) for the difference in transcript levels in the peripheral blood transcriptome remain to be clarified.

We also detected expression of the HBG1 and HBG2 genes, which encode the fetal β-type globins γA and γG, respectively. In this analysis, we scored HBG1 or HBG2 as positive if the read count composition of the gene exceeded 1% of the total read count. According to this criterion, HBG1 and HBG2 were expressed in 2 (0.35%) and 7 (1.22%) of the 576 adult participants, respectively, indicating that our approach to whole blood transcriptome analysis successfully identified fetal γ globin gene expression in rare adults (Table 2 and Fig. 2B).
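The positivity criterion can be written down in a few lines. The sketch below assumes that the denominator is the sum of all assigned reads in a sample, which the text does not state explicitly; the function name `is_positive`, the gene label `OTHER` and the counts are hypothetical.

```python
def is_positive(sample_counts, gene, threshold=0.01):
    """True if the gene's share of the sample's read count exceeds threshold."""
    total = sum(sample_counts.values())
    if total == 0:
        return False
    return sample_counts.get(gene, 0) / total > threshold


# Hypothetical toy samples with 10,000 assigned reads each
sample_a = {"HBA1": 6000, "HBB": 2500, "HBG2": 300, "OTHER": 1200}
sample_b = {"HBA1": 6000, "HBB": 2500, "HBG2": 5, "OTHER": 1495}

samples = {"a": sample_a, "b": sample_b}
positive = [name for name, counts in samples.items()
            if is_positive(counts, "HBG2")]
print(positive)
```

Applied to all samples, the length of such a list corresponds to the ‘Detection Number’ column of Table 2, and its share of the cohort to the ‘Detection Rate’.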

We found that the two HBG1-positive cases showed high-level expression. In particular, one case showed a very high level of HBG1 and a low level of HBB expression, while the other showed a high level of HBG1 and a moderate level of HBB expression (Fig. 2C). HBG2 expression in these two cases showed a similar profile, while the remaining five positive cases showed a moderate increase in HBG1 TPM, but no increase in the HBG2 level (Fig. 2D).

These globin gene expression profiles are shown collectively in Figure 2E. These results demonstrate that cases of hereditary persistence of fetal hemoglobin (HPFH) exist in the Japanese population and that our RNA-seq approach, which retains globin genes, has the potential to detect them.

Whole blood transcriptome profile reflects cell contents in peripheral blood

To examine the blood cell content of whole blood samples, we next investigated the expression profile of non-globin genes. Our hypothesis was that the content of each white blood cell type can be assessed through the expression profiles of lineage-specific genes.

We first obtained an overview of the gene expression profile by PCA. To assess the influence of globin transcript removal, we performed the PCA in the presence of globin gene transcripts. We then recalculated the gene expression data after excluding the reads assigned to the globin gene group and repeated the analysis. The results are shown side by side, in Figure 3A–C with globin transcripts and in Figure 3D–F without globin transcripts.

When PCA was performed, clusters clearly related to age or gender group were not observed in either data set, with or without globin transcripts (Fig. 3A and 3D). We then examined the functions of the gene sets strongly associated with the first principal component (PC1) axis, which had a high contribution rate of 48.8% (with globin transcripts; Fig. 3B) and 28.6% (without globin transcripts; Fig. 3E). Removal of globin transcripts reduced the contribution rate of the PC1 axis, perhaps because the contribution of the globin transcripts was lost.

PCA with globin transcripts detected functions related to ‘red blood cell function’ and ‘iron metabolism’, suggesting that the expression levels of the globin genes, which are generally excluded experimentally as noise, have a strong influence on the entire data set (Fig. 3B). Indeed, a strong positive correlation (SCC = 0.87) was observed between HBA1 gene expression levels and the PC1 scores (Fig. 3C).

In the PCA of the data without globin genes, the functions detected by PC1 differed markedly from those detected with globin transcripts (Fig. 3E). Comparison of the PC1 results with and without globin genes showed that the top 5 functions in the with-globin analysis all disappeared in the without-globin analysis; these included ‘Erythrocytes take up carbon dioxide and release oxygen’, ‘Binding and uptake of ligands by scavenger receptors’ and ‘Heme signaling’ (Fig. 3B). We surmise that globin gene transcripts contribute substantially to the assignment of these functions.

Of note, the function ‘neutrophil degranulation’ was detected as positively related to PC1, while ‘Processing of capped intron-containing pre-mRNA’, ‘tRNA processing’ and ‘rRNA processing’ were detected as negatively related to PC1 (Fig. 3E). We selected C5AR1 (34) as a candidate transcript expressed specifically in neutrophils and examined the association between its expression level and the PC1 score. We found a positive correlation (SCC = 0.69) between the expression level of C5AR1 and the PC1 score (Fig. 3F), indicating that neutrophil-related genes contribute to this association.

As the samples for this analysis were from the TMM biobank, which has been collecting blood cell component data, we capitalized on these data and examined the association between the PC1 score and the numbers of various types of blood cells. As expected, we found a strong positive correlation (SCC = 0.80) between the neutrophil number and the PC1 score (Fig. 3G). In contrast, we found a strong negative correlation (SCC = -0.78) between the lymphocyte number and the PC1 score (Fig. 3H). This correlation was reproduced in the PC2 score of the with-globin analysis (Table 2B), indicating the importance of globin transcript removal for assessing blood cell components. Notably, a strong correlation (SCC = 0.80) was also observed between the PC1 score and the neutrophil-to-lymphocyte ratio (NLR), which is used as one of the indicators of immune function. These results support our contention that whole blood transcriptome results reflect blood cell content and can be used to predict the proportions of blood cell components in peripheral blood.
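The correlations above could be computed along the following lines. This sketch assumes that SCC denotes Spearman's rank correlation coefficient, which the text does not spell out, and uses made-up cell counts and PC1 scores; all function names are hypothetical.

```python
def ranks(values):
    """Average ranks (1-based), with ties sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical values for five samples
neutrophils = [2.0, 3.5, 4.0, 5.5, 6.0]
lymphocytes = [3.0, 2.5, 2.0, 1.5, 1.0]
pc1 = [-10.0, -4.0, 1.0, 5.0, 12.0]
nlr = [n / l for n, l in zip(neutrophils, lymphocytes)]

rho_nlr = spearman(nlr, pc1)
rho_lymph = spearman(lymphocytes, pc1)
print(rho_nlr, rho_lymph)
```

A rank-based coefficient is insensitive to the scale of the inputs, so the same function can be used for cell counts, ratios and expression levels alike.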

Substantial numbers of differentially expressed genes are found by age and gender

To understand the influence of age and gender on the gene expression profile of the whole blood transcriptome, we characterized gene expression in a population stratified by age and gender. To this end, we used the transcript data without globin genes from 106 females in their 20s to 30s (excluding the 14 pregnant women), 172 females in their 60s to 70s, 109 males in their 20s to 30s and 175 males in their 60s to 70s (Fig. 4A). In this figure, an arrow from A to B indicates higher gene expression in B.

We first assessed the influence of age on the whole blood transcriptome and identified 1309 genes for women (Fig. 4A, #1 plus #2) and 366 genes for men (#3 plus #4). The expression levels of these two groups of genes changed significantly with age (Fig. 4B). Of the 1309 genes identified in females, 687 genes were down-regulated, while 642 genes were up-regulated in the aged group. Of the 366 genes identified in males, 274 genes were down-regulated, while 92 genes were up-regulated in the aged group. Thus, the most salient finding here is that more genes show significant age-related expression changes in females than in males. In addition, males have about three times more down-regulated than up-regulated genes with age, while females have similar numbers of down-regulated and up-regulated genes.

We then analyzed genes whose expression differed by gender and further characterized these genes in combination with age. The number of genes whose expression differed between females and males was 428 in the 20s to 30s (Fig. 4A, #5 plus #6) and 1129 in the 60s to 70s (#7 plus #8), indicating that the gender difference in whole blood gene expression increases with age. Of the 1129 genes, 609 showed higher expression in males than in females, while 520 showed higher expression in females than in males (Fig. 4C). These results demonstrate that the number of genes differentially expressed by gender increases approximately three-fold in the aged groups.

Functional characteristics of DEGs by age and gender

We next examined the functions of the genes in each category defined in the previous section, starting with the genes influenced by age. As there are approximately 4 times more genes in females than in males in this category (Fig. 4B), we characterized the age-influenced genes in females first. In women, the hallmark gene sets showing age-dependent up-regulation include ‘UV RESPONSE DN’ and ‘HEME METABOLISM’ (Fig. 4D, #1), while the genes down-regulated with age have no significantly enriched functions. The genes in each category are shown in Supplementary Table S1. In men, the hallmark gene sets showing age-dependent down-regulation include ‘G2M CHECKPOINT’, ‘E2F TARGETS’ and ‘MYC TARGETS V1’ (Fig. 4D, #4), while the genes up-regulated with age have no significantly enriched functions.

We then conducted functional annotation of the genes whose expression differed by gender in combination with age. In the 20s to 30s, ‘HEME METABOLISM’ was detected as a function of the genes whose expression was higher in males than in females (Fig. 4E, #5), and ‘INTERFERON GAMMA RESPONSE’ and ‘INTERFERON ALPHA RESPONSE’ were detected as functions of the genes whose expression was higher in females than in males (Fig. 4E, #6). Interestingly, in the 60s to 70s, ‘REACTIVE OXYGEN SPECIES PATHWAY’ was detected as a function of the genes whose expression was higher in males than in females (Fig. 4E, #7). As in the younger group, ‘HEME METABOLISM’ was detected for genes whose expression was higher in males than in females (Fig. 4E, #5), whereas ‘INTERFERON GAMMA RESPONSE’ and ‘INTERFERON ALPHA RESPONSE’ were detected as functions of the genes whose expression was higher in females than in males (Fig. 4E, #6).

RNA sequencing of globin gene family has a potential to detect HPFH patients. (A) Relationship between HBA1 and HBB gene expression levels. TPM (Transcripts Per Million) values are used in the analysis. (B) Relationship between HBG1 and HBG2 gene expression levels. (C) Relationship between HBG1 and HBB gene expression levels. (D) Relationship between HBG2 and HBB gene expression levels. (E) Gene expression composition of globin family genes.
Fig. 2


Annotation of genes changed by age and gender

Given the significant differences in gene expression profiles by age and gender that we found in the human whole blood transcriptome, our next step was to scrutinize the expression profiles of individual genes within each category, stratified by age and gender. We aimed to compare overlaps or redundancies among DEGs. To achieve this, we relaxed the criterion of the DEG estimation procedure, applying an FDR cutoff of 5% in this analysis. Consequently, we identified more genes than those analyzed in the previous section.
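To illustrate why a 5% FDR cutoff yields more genes than a 1% cutoff, the sketch below applies the Benjamini-Hochberg adjustment, which is the default multiple-testing correction in DESeq2, to a handful of made-up p-values. It does not reproduce DESeq2's further steps such as independent filtering; the function name and values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for five genes
pvals = [0.001, 0.008, 0.02, 0.045, 0.30]
padj = benjamini_hochberg(pvals)

n_fdr_1 = sum(q < 0.01 for q in padj)   # genes passing a 1% FDR cutoff
n_fdr_5 = sum(q < 0.05 for q in padj)   # genes passing a 5% FDR cutoff
print(padj, n_fdr_1, n_fdr_5)
```

In this toy example one gene passes the 1% cutoff and three pass the 5% cutoff, mirroring the larger DEG sets obtained in this section.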

We began by comparing age-dependent gene expression differences between the female and male groups. To accomplish this, we first classified genes by the direction of their age-related expression change and then subdivided them by gender. This process resulted in nine groups (Fig. 5A). For instance, Group FY > E MY > E comprises genes with higher expression in the young (20–30s) than in the elderly (60–70s), irrespective of gender.
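The nine groups arise because each gene can take one of three states in each gender: higher in the young, higher in the elderly, or not significantly changed. A minimal sketch of such a classification follows, assuming a log2 fold change defined as elder over young and the 5% FDR cutoff of this section. The label format, function names and values are hypothetical.

```python
def direction(log2fc_elder_vs_young, padj, alpha=0.05):
    """Return 'Y>E', 'E>Y' or 'n.s.' for one gene within one gender."""
    if padj is None or padj >= alpha:
        return "n.s."
    return "E>Y" if log2fc_elder_vs_young > 0 else "Y>E"


def classify(female, male, alpha=0.05):
    """female, male: (log2 fold change, adjusted p-value) for one gene."""
    f = direction(female[0], female[1], alpha)
    m = direction(male[0], male[1], alpha)
    return "F:" + f + " M:" + m


# Hypothetical test results for three genes
genes = {
    "gene_1": ((-1.2, 0.001), (-0.9, 0.003)),  # higher in young, both genders
    "gene_2": ((0.8, 0.020), (0.1, 0.600)),    # higher in elderly, females only
    "gene_3": ((0.2, 0.700), (-0.3, 0.400)),   # no significant change
}
groups = {name: classify(f, m) for name, (f, m) in genes.items()}

# Enumerate every combination of the three states to confirm nine groups
states = [(-1.0, 0.001), (1.0, 0.001), (0.0, 0.9)]
all_groups = {classify(f, m) for f in states for m in states}
print(groups, len(all_groups))
```

The group in which both genders are not significant contains genes without age-related change, so eight of the nine groups carry DEGs.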

Neutrophil and lymphocyte composition mainly affects the divergence of whole blood gene expression profile. (A) PCA of gene expression profile without in silico removal of globin gene counts. (B) Pathway enrichment analysis of strongly PC1-associated genes in A (|Factor Loading| > 0.8). (C) Correlation of HBA1 gene expression level with PC1 score in A. (D) PCA of gene expression profile after in silico removal of globin gene counts. (E) Pathway enrichment analysis of strongly PC1-associated genes in D (|Factor Loading| > 0.8). (F) Correlation of C5AR1 gene expression level with PC1 score in D. C5AR1 is selected as a neutrophil marker. (G) Correlation of neutrophil content with PC1 score in D. (H) Correlation of lymphocyte content with PC1 score in D. Note the inverse relationship of neutrophil and lymphocyte content to PC1 score. (I) Correlation of neutrophil-to-lymphocyte ratio (NLR) with PC1 score in D.
Fig. 3


Gene expression alteration by age and gender in human whole blood. (A) Schematic summary of the gene expression alteration by age and gender. Note that dot and arrow represent the higher expression of the gene. (B) Number of age-associated genes stratified by gender. Numbers correspond to those in panel (A). (C) Number of gender-associated genes stratified by age group. (D) Gene set enrichment analysis against MSigDB hallmark gene sets for the significantly age-related gene sets as in (A). (E) Gene set enrichment analysis against MSigDB hallmark gene sets for the significantly gender-biased gene sets as in (A).
Fig. 4


Interplay of age and gender on whole blood gene expression. (A) Schematic diagram of gene classification based on age-related expression level alteration stratified by gender. F and M stand for female and male, respectively. Y and E stand for young (20–30s) and elderly (60–70s), respectively. (B) Schematic diagram of gene classification based on gender-related expression level alteration stratified by age group. (C) Distribution of the mean normalized expression level per sample group for gender-biased age-related DEGs. (D) Gene set enrichment analysis against MSigDB hallmark gene sets for each category in (C) (FDR < 0.01). (E) Distribution of the mean normalized expression level per sample group for age-biased gender-related DEGs. (F) Gene set enrichment analysis against MSigDB hallmark gene sets for each category in (E) (FDR < 0.01).
Fig. 5


We then compared how the expression of genes exhibiting gender-dependent variation differs between the young and elderly groups. For this analysis, we first stratified gene expression by gender and then further subdivided the genes by age. This method also yielded nine distinct groups (Fig. 5B). For example, Group YF > M EF > M includes genes whose expression is higher in females than in males, regardless of age.
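The nine-way grouping can be illustrated with a minimal sketch. It assumes that each gene already carries a differential-expression call ('>', '<' or '=') for the female-versus-male comparison in each age stratum; the function name `classify_gene`, the call encoding and the example genes are hypothetical and are not taken from the analysis pipeline of this study.

```python
from collections import Counter
from itertools import product

# Each call compares female vs. male expression within one age stratum:
# ">" female higher, "<" male higher, "=" no significant difference.
CALLS = (">", "<", "=")

def classify_gene(young_call: str, elderly_call: str) -> str:
    """Return a group label such as 'YF > M EF = M' for one gene."""
    if young_call not in CALLS or elderly_call not in CALLS:
        raise ValueError("calls must be one of '>', '<', '='")
    return f"YF {young_call} M EF {elderly_call} M"

# Three possible calls in each of two strata give 3 x 3 = 9 groups.
ALL_GROUPS = [classify_gene(y, e) for y, e in product(CALLS, CALLS)]

# Hypothetical per-gene calls: (young stratum, elderly stratum).
example_calls = {
    "GENE_A": (">", ">"),  # higher in females regardless of age
    "GENE_B": ("=", "<"),  # higher in males only in the elderly
    "GENE_C": (">", "="),  # higher in females only in the young
}

# Number of example genes falling into each group.
group_sizes = Counter(classify_gene(y, e) for y, e in example_calls.values())
```

The same scheme applies to the age-first stratification by swapping the roles of the two factors in the label.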

How age-dependent gene expressions differ in female and male groups

In this analysis, we first examined gene expression variations with age and then listed them separately in the female and male groups (Fig. 5C). Regardless of gender, the expression of 99 genes decreased with age (Group FY > E MY > E), while the expression of 137 genes increased with age (Group FY < E MY < E). Thus, the expression of a total of 236 genes changed with age, either positively or negatively, irrespective of gender.

Effect of age and gender on the expression of oxidative and inflammatory stress-related genes. (A) Expression pattern for the REACTIVE_OXYGEN_SPECIES_PATHWAY genes. Asterisk (*) indicates genes that are categorized into Group YF = M EF < M in Figure 5F. Expression patterns of four genes in the pathway are shown: (B) NFE2L2 (NRF2), (C) GLRX (Glutaredoxin), (D) TXN (Thioredoxin) and (E) PDLIM1 (PDZ and LIM domain protein 1). Data from pregnant females in B–E also show pregnancy-induced changes in the expression of these genes. P, Pregnant; NP, Non-Pregnant.
Fig. 6


In contrast, when we considered genes whose expression changed with age in only one gender, a much wider range of changes was seen. In females, the expression of 1270 genes decreased with age (Group FY > E MY = E), while the expression of 1282 genes increased with age (Group FY < E MY = E). In males, the numbers of genes in these categories were markedly smaller than in females: the expression of 433 genes decreased (Group FY = E MY > E) and that of 88 genes increased (Group FY = E MY < E) with age. Only one gene each was found for Groups FY > E MY < E and FY < E MY > E, which correspond to genes whose expression increases in males but decreases in females and genes whose expression decreases in males but increases in females, respectively. Therefore, we did not show these two groups in Figure 5C.

We then explored enriched functions and found nine functions in Groups FY > E MY = E, FY < E MY = E and FY = E MY > E (Fig. 5D). In Group FY > E MY = E, in which the age-related decrease in gene expression occurred predominantly in females, ‘IL6 JAK STAT3 SIGNALING’, ‘PROTEIN SECRETION’, ‘INFLAMMATORY RESPONSE’, ‘COMPLEMENT’ and ‘MTORC1 SIGNALING’ were identified (Fig. 5D). In Group FY < E MY = E, in which the age-related increase in gene expression occurred predominantly in females, ‘HEME METABOLISM’ was detected (Fig. 5D). This agrees well with the result for group #1 in Figure 4D. In Group FY = E MY > E, in which the age-related decrease in gene expression occurred predominantly in males, ‘G2M CHECKPOINT’, ‘E2F TARGETS’ and ‘MYC TARGETS V2’ were identified (Fig. 5D). Again, this agrees well with the result for group #4 in Figure 4D.

How gender-dependent gene expressions differ in young and elderly groups

We next examined gene expression variations with gender in the young and elderly groups (Fig. 5E). In this analysis, we first examined gene expression variations with gender and then listed them separately by age group. Regardless of age, 74 genes were more highly expressed in females (Group YF > M EF > M), while 120 genes were more highly expressed in males (Group YF < M EF < M). In contrast, 216 genes were expressed more in females than in males only in the young group (Group YF > M EF = M), and 458 genes were expressed more in males than in females only in the young group (Group YF < M EF = M), indicating that the expression of these two groups of genes differs between genders only in the young group.

In the elderly group (60–70s), 1140 genes showed higher expression in females than in males (Group YF = M EF > M) and 1142 genes showed higher expression in males than in females (Group YF = M EF < M). These results indicate that a large number of genes differ in expression between the genders in the elderly group, which agrees well with the finding in Figure 4C.

When the enriched functions of each gene group were investigated, ‘INTERFERON GAMMA RESPONSE’, ‘INTERFERON ALPHA RESPONSE’ and ‘INFLAMMATORY RESPONSE’ were found in the Group YF > M EF = M (Fig. 5F). For the Group YF < M EF = M, ‘HEME METABOLISM’ and ‘EPITHELIAL MESENCHYMAL TRANSITION’ functions were significantly enriched (Fig. 5F). For the Group YF = M EF > M, ‘ALLOGRAFT REJECTION’, ‘ESTROGEN RESPONSE EARLY’ and ‘MYC TARGETS V2’ were detected as enriched functions (Fig. 5F). For the Group YF = M EF < M, ‘MTORC1 SIGNALING’, ‘OXIDATIVE PHOSPHORYLATION’, ‘REACTIVE OXYGEN SPECIES PATHWAY’, ‘PROTEIN SECRETION’ and ‘COMPLEMENT’ were detected as enriched functions (Fig. 5F).

While in this study we explored gene expression variability with aging and sex differences at a population level, it is also important to address gene expression variability at the individual level. As a preliminary approach, we examined gene–gene relationships within the UV RESPONSE DN and HEME METABOLISM gene sets and found that not all gene pairs from these functional groups exhibit a high correlation in expression levels. Due to the complexity of presenting scatter plots for numerous gene pairs, we calculated correlation coefficients as a summary statistic to assess gene expression similarities at the individual level (Supplementary Table S1). The expression level of one gene related to HEME METABOLISM shows a positive correlation with multiple genes in the UV RESPONSE DN set, suggesting its potential role in regulating gene expression across various biological functions. In contrast, the other gene showed a lesser association. These insights underscore the significance of individual gene expression patterns. Individual gene expression data of this study are available in the jMorp database [https://jmorp.megabank.tohoku.ac.jp].
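A cross-set correlation table of this kind can be sketched as follows. This is a minimal illustration that assumes Pearson correlation on normalized expression values across individuals; the type of correlation coefficient is not stated above, and the function names, gene labels and values are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)

def cross_set_correlations(expr, set_a, set_b):
    """Correlate every gene in set_a with every gene in set_b across samples."""
    return {(a, b): pearson(expr[a], expr[b]) for a in set_a for b in set_b}

# Hypothetical normalized expression values for four individuals.
expr = {
    "HEME_GENE_1": [1.0, 2.0, 3.0, 4.0],
    "UV_DN_GENE_1": [2.0, 4.0, 6.0, 8.0],  # rises with HEME_GENE_1
    "UV_DN_GENE_2": [4.0, 3.0, 2.0, 1.0],  # falls as HEME_GENE_1 rises
}

table = cross_set_correlations(
    expr, ["HEME_GENE_1"], ["UV_DN_GENE_1", "UV_DN_GENE_2"]
)
```

Each key of `table` is a gene pair, so a gene that correlates with many members of the other set can be found by scanning the values for that gene.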

Effect of age and gender on the expression of oxidative and inflammatory stress-related genes

To gain more insight into the influence of age and gender on whole blood transcriptome profiles, we next focused on a specific pathway. For this purpose, we examined the expression profiles of the REACTIVE_OXYGEN_SPECIES_PATHWAY gene set in MSigDB, which includes 49 genes (Fig. 6A).

In this analysis, we first noticed that the NFE2L2 gene is expressed at a significantly higher level in young females and that its expression is down-regulated with age (Fig. 6B). The NFE2L2 gene encodes NRF2, a transcription factor regulating the inducible expression of antioxidant and phase II detoxification-related genes. Expression levels of the NFE2L2 gene in young and elderly males are low and comparable with those in elderly females.

To gain further insight, we compared expression levels of the NFE2L2 gene in 14 pregnant females (see Fig. 1B) with those in 106 non-pregnant young females. We found that NFE2L2 gene expression is significantly higher in pregnant females than in non-pregnant young females (Fig. 6B). As young females already show a much higher level of NFE2L2 gene expression than elderly females and both young and elderly males, these results demonstrate that the NFE2L2 expression level in whole blood increases significantly with pregnancy.

We also found that approximately 30% of the genes in this pathway (15 of 49) exhibit expression changes similar to those of the NFE2L2 gene, showing higher expression only in young females (Fig. 6A). We surmised that this group of genes may include NRF2 target genes. As expected, we found several typical NRF2 target genes, including the GLRX and TXN genes. We therefore performed a similar analysis that included the pregnant females. We found that expression of the GLRX (Fig. 6C) and TXN (Fig. 6D) genes is reproducibly and significantly higher in pregnant young females than in non-pregnant young females. Again, the expression of both genes in young females is higher than in elderly females and in both young and elderly males.

To gain a more comprehensive understanding of our findings related to NRF2, we calculated correlation coefficients between the expression level of the NFE2L2 gene and those of its major downstream genes, including GLRX and TXN (Supplementary Table S2). We found that detoxification and antioxidant genes generally show a positive correlation, while proinflammatory genes, such as TNF-α and IL-6, show a negative correlation. These results agree well with previous analyses, as the expression of the former genes appears to be activated by NRF2, while that of the latter genes has been shown to be repressed by NRF2 (35).

We found that several immune-related genes, such as PDLIM1, show higher expression in elderly females than in young females, a change opposite to that of NFE2L2 (Fig. 6A). PDLIM1 plays an essential role in inflammatory responses by binding to NF-κB and inhibiting its nuclear translocation. Intriguingly, PDLIM1 gene expression is significantly lower in pregnant females than in non-pregnant young females, elderly females, elderly males and young males (Fig. 6E). These results suggest that anti-inflammatory mechanisms might be activated in pregnant females.

These results for the REACTIVE_OXYGEN_SPECIES_PATHWAY gene set thus revealed that the expression of oxidative and inflammatory stress-response genes in the whole blood transcriptome is under elaborate and meticulous regulation (Supplementary Fig. S3), which may have important clinical relevance.

Discussion

In this study, we aimed to prepare a foundational platform of whole blood transcriptome of the Japanese population. For this endeavor, we chose 576 individuals from the participants of the TMM Cohorts and Biobank. The selected group consists of 120 young females and 109 young males, as well as 172 elderly females and 175 elderly males. Notably, our young female group includes 14 pregnant individuals. We conducted whole blood transcriptome analysis on all these stratified groups. By combining RNA sequencing of the globin-including library and in silico removal of globin gene reads, we efficiently captured both the expression dynamics of the globin gene family and accurate expression profiles of all transcripts in whole blood samples. Our results show that a significant number of genes are differentially expressed in both the female and male groups as well as the young and elderly groups. This includes genes found in the reactive oxygen species pathway and other pathways. While recent studies have reported whole blood transcriptome analyses on Japanese subjects (24, 36, 37), to the best of our knowledge, our study is the most extensive and comprehensive whole blood transcriptome data set of the Japanese population to date.
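The effect of in silico globin removal on normalized expression values can be sketched as follows. This is a minimal illustration that assumes counts-per-million (CPM) normalization; the globin gene list shown is an illustrative subset rather than the list used in this study, and the function names and read counts are hypothetical.

```python
# Illustrative globin gene symbols; the exact list used in the study may differ.
GLOBIN_GENES = {"HBA1", "HBA2", "HBB", "HBD", "HBG1", "HBG2"}

def counts_per_million(counts):
    """Scale raw read counts so that they sum to one million per sample."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {gene: c * 1_000_000 / total for gene, c in counts.items()}

def remove_globin(counts, globin_genes=GLOBIN_GENES):
    """Drop globin gene counts before normalization (in silico removal)."""
    return {gene: c for gene, c in counts.items() if gene not in globin_genes}

# Hypothetical raw counts for one whole blood sample dominated by globin reads.
sample = {"HBA1": 600_000, "HBB": 300_000, "C5AR1": 60_000, "NFE2L2": 40_000}

with_globin = counts_per_million(sample)                    # globin retained
without_globin = counts_per_million(remove_globin(sample))  # globin removed
```

In this toy sample the CPM of C5AR1 is ten times higher after globin removal, because globin reads no longer dominate the denominator. Keeping both tables preserves the globin expression information while giving comparable profiles for the remaining transcripts.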

Traditionally, whole blood transcriptome analysis involves the removal of globin gene transcripts along with rRNAs (38). However, we decided to challenge this practice in our study by analyzing the expression of globin gene families. This approach allowed us to gain fundamental insights into the expression levels of each globin gene in adults. Indeed, we found that 7 of the 576 adult subjects analyzed displayed significant levels of expression of the fetal HBG1 or HBG2 genes (encoding Hb Aγ and Gγ, respectively), indicating that the incidence of HPFH in the Japanese population is approximately 1.22%. While the mechanistic basis for HPFH is yet to be clarified, it is interesting to note that in our long-read sequence analysis of the Japanese population, we found a roughly 5-kb deletion between the HBG1 and HBG2 genes in 2 of 222 individuals (39). The region has been reported to contain an important negative regulatory region for β-type globin gene expression (40, 41).
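The two frequencies quoted above follow from simple proportions, as the short sketch below shows. The helper name `percent` is hypothetical; the counts are those given in the text.

```python
def percent(cases: int, total: int) -> float:
    """Return cases / total expressed as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * cases / total

# Subjects with marked HBG1 or HBG2 expression among all subjects analyzed.
hpfh_percent = percent(7, 576)

# Carriers of the roughly 5-kb deletion in the long-read analysis.
deletion_percent = percent(2, 222)
```

Rounded to two decimal places, `hpfh_percent` gives the 1.22% stated above, and `deletion_percent` is about 0.90%.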

It is well known that gene expression profiles in human blood samples are influenced by both the age and gender of subjects (18–21, 42). Consistent with these findings, our study revealed that a substantial number of genes are differentially expressed between the female and male groups, as well as between the young and elderly groups. Intriguingly, we observed that females exhibited more extensive age-related changes in gene expression profile (642 genes increased and 687 decreased with age) than males did (92 increased and 274 decreased with age). We hypothesize that these gender differences may impact the regulation of hematopoietic cell differentiation, which becomes more prevalent with age. This contention aligns with a recent study showing that females retain a lower percentage of natural killer cells and a higher percentage of plasma cells in peripheral blood compared to males, but these gender differences diminish with age (43). It has also been reported that some of the hematopoietic transcription factors are influenced by sex hormones. For instance, GATA1 is negatively influenced by estrogen (44).

It should be noted that variations in individual whole blood transcriptome profiles seem to be influenced substantially by the numbers of neutrophils and lymphocytes present in each sample. This observation suggests that the immune system activity of each individual is one of the major determinants of gene expression profiles in human whole blood transcriptome analysis. Conversely, this finding also points to the potential of utilizing whole blood transcriptome results for estimating peripheral blood cell composition or deconvoluting peripheral blood cells. With this in mind, we believe that the outcomes of whole blood transcriptome analysis may have extensive applications in the future.
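The relationship between cell composition and the first principal component can be sketched as follows. This is a minimal illustration that assumes PC1 scores have already been computed and uses Pearson correlation; all values are hypothetical and are not data from this study. The sign of a principal component is arbitrary, so only the opposite directions of the neutrophil and lymphocyte correlations are meaningful.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)

def neutrophil_to_lymphocyte_ratio(neutrophils, lymphocytes):
    """NLR per sample; both inputs must use the same unit (e.g. % of leukocytes)."""
    if len(neutrophils) != len(lymphocytes):
        raise ValueError("inputs must have the same length")
    if any(l <= 0 for l in lymphocytes):
        raise ValueError("lymphocyte values must be positive")
    return [n / l for n, l in zip(neutrophils, lymphocytes)]

# Hypothetical values for four samples.
neutrophil_pct = [50.0, 60.0, 70.0, 80.0]
lymphocyte_pct = [40.0, 30.0, 20.0, 10.0]
pc1_score = [-2.0, -1.0, 1.0, 2.0]

nlr = neutrophil_to_lymphocyte_ratio(neutrophil_pct, lymphocyte_pct)
r_neutrophil = pearson(neutrophil_pct, pc1_score)  # positive in this toy data
r_lymphocyte = pearson(lymphocyte_pct, pc1_score)  # negative in this toy data
r_nlr = pearson(nlr, pc1_score)
```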

We identified significant enrichments of biological functions or pathways for DEGs associated with age or gender in the whole blood transcriptome analysis. For instance, heme metabolism was noted as a function of genes whose expression was lower in young females in their 20s to 30s. Consistent with this result, concentrations of heme and its related metabolites in serum were found to be higher in males than in females (45). In contrast, immune response-related genes exhibited significantly higher expression levels in young females, suggesting that immune response activity is more pronounced in young females than in young males. This finding is consistent with previous studies comparing gene expression profiles between females and males (46). Among the genes whose expression levels decline with age solely in males, we identified cell cycle regulation-related functions, such as ‘G2M CHECKPOINT’, ‘E2F TARGETS’ and ‘MYC TARGETS V1’. This presents potentially crucial fundamental information about gender-related differences in cellular senescence and/or cancer cell growth (47). Furthermore, ‘REACTIVE OXYGEN SPECIES PATHWAY’ was detected as an enriched function of genes with higher expression levels in elderly males compared to elderly females. This finding may link the distinct abilities of men and women to respond to oxidative stress with the differences in their rates of cardiovascular disease (48).

Our data set includes whole blood transcriptome data taken during pregnancy, providing us with the opportunity to examine the impact of pregnancy on gene expression in whole blood. We observed significant increases in the expression of genes, such as glutaredoxin and thioredoxin, which are associated with the oxidative stress response (49), during pregnancy. Upon detailed scrutiny of the pregnancy-induced changes in gene expression, we identified functions related to immune response and heme metabolism as the functions of genes whose expression increased during pregnancy (Supplementary Fig. S4). Recent studies have shown that the proportion of neutrophils in the blood rises during pregnancy (50). Consequently, our observations may be indicative of the higher expression of genes related to immune function as the neutrophil ratio increases.

It has been shown that both lactotransferrin (LTF) and matrix metallopeptidase 8 (MMP8) are essential for various immune functions (51, 52). Our study found that the expression of LTF and MMP8 is much higher in pregnant women in their 30s than in non-pregnant women in their 20s to 30s. Furthermore, we observed an increase in the expression of genes essential for heme and heme protein biosynthesis, such as hydroxymethylbilane synthase (HMBS) and alpha hemoglobin stabilizing protein (AHSP), during pregnancy. Since genetic polymorphisms of the HMBS gene have been reported to be involved in acute intermittent porphyria (53), our data on the expression levels of these heme biosynthesis-related genes could potentially contribute to a better understanding of the mechanisms underlying iron deficiency anemia during pregnancy.

In summary, our findings offer novel insights into individual gene expression variation within the whole blood transcriptome in the Japanese population. To further promote personalized and precision medicine, it will be essential to clarify the relationship between gene expression variation and genetic variation through eQTL analysis (54).

Supplementary Data

Supplementary data are available at JB Online.

Funding

This research was supported (in part) by the Japan Agency for Medical Research and Development (AMED) under grant JP21tm0424601.

Conflict of Interest

None declared.

Author contributions

Y.A. contributed to data analysis and interpretation and drafted the original manuscript. K.T. contributed to blood RNA samples and drafted the original manuscript. H.A., I.M. and Ke.K. supervised data analysis. J.K., A.O., L.B., T.S. and F.K. supervised interpretation in RNA-seq analysis. N.I., A.H. and Ka.K. provided blood samples. K.O. supervised interpretation in hematology. M.Y. wrote the manuscript. All authors reviewed the manuscript and approved the final version of the manuscript to be published.

Acknowledgements

The authors would like to thank Takumi Fukushi, Nanae Osanai, Noriko Takahashi, Keiko Tateno and HaploPharma Inc. (Sendai, Miyagi, Japan) for technical assistance with the DNBSEQ-G400 sequencer.

References

1. Ishigaki, A., Higashi, H., Sakamoto, T., and Shibahara, S. (2013) The great East-Japan earthquake and devastating tsunami: an update and lessons from the past great earthquakes in Japan since 1923. Tohoku J. Exp. Med. 229, 287–299

2. Fuse, N., Sakurai-Yageta, M., Katsuoka, F., Danjoh, I., Shimizu, R., Tamiya, G., Nagami, F., Kawame, H., Higuchi, S., Kinoshita, K., Kure, S., and Yamamoto, M. (2019) Establishment of integrated biobank for precision medicine and personalized healthcare: the Tohoku medical megabank project. JMA J. 2, 113–122

3. Nagami, F., Kuriki, M., Koreeda, S., Kageyama, M., Shimizu, O., Toda, S., Hozawa, A., Kuriyama, S., Osumi, N., and Yamamoto, M. (2020) Public relations and communication strategies in construction of large-scale cohorts and BioBank: practice in the Tohoku medical megabank project. Tohoku J. Exp. Med. 250, 253–262

4. Kuriyama, S., Yaegashi, N., Nagami, F., Arai, T., Kawaguchi, Y., Osumi, N., Sakaida, M., Suzuki, Y., Nakayama, K., Hashizume, H., Tamiya, G., Kawame, H., Suzuki, K., Hozawa, A., Nakaya, N., Kikuya, M., Metoki, H., Tsuji, I., Fuse, N., Kiyomoto, H., Sugawara, J., Tsuboi, A., Egawa, S., Ito, K., Chida, K., Ishii, T., Tomita, H., Taki, Y., Minegishi, N., Ishii, N., Yasuda, J., Igarashi, K., Shimizu, R., Nagasaki, M., Koshiba, S., Kinoshita, K., Ogishima, S., Takai-Igarashi, T., Tominaga, T., Tanabe, O., Ohuchi, N., Shimosegawa, T., Kure, S., Tanaka, H., Ito, S., Hitomi, J., Tanno, K., Nakamura, M., Ogasawara, K., Kobayashi, S., Sakata, K., Satoh, M., Shimizu, A., Sasaki, M., Endo, R., Sobue, K., The Tohoku Medical Megabank Project Study Group, and Yamamoto, M. (2016) The Tohoku medical megabank project: design and mission. J Epidemiol. 26, 493–511

5. Hozawa, A., Tanno, K., Nakaya, N., Nakamura, T., Tsuchiya, N., Hirata, T., Narita, A., Kogure, M., Nochioka, K., Sasaki, R., Takanashi, N., Otsuka, K., Sakata, K., Kuriyama, S., Kikuya, M., Tanabe, O., Sugawara, J., Suzuki, K., Suzuki, Y., Kodama, E.N., Fuse, N., Kiyomoto, H., Tomita, H., Uruno, A., Hamanaka, Y., Metoki, H., Ishikuro, M., Obara, T., Kobayashi, T., Kitatani, K., Takai-Igarashi, T., Ogishima, S., Satoh, M., Ohmomo, H., Tsuboi, A., Egawa, S., Ishii, T., Ito, K., Ito, S., Taki, Y., Minegishi, N., Ishii, N., Nagasaki, M., Igarashi, K., Koshiba, S., Shimizu, R., Tamiya, G., Nakayama, K., Motohashi, H., Yasuda, J., Shimizu, A., Hachiya, T., Shiwa, Y., Tominaga, T., Tanaka, H., Oyama, K., Tanaka, R., Kawame, H., Fukushima, A., Ishigaki, Y., Tokutomi, T., Osumi, N., Kobayashi, T., Nagami, F., Hashizume, H., Arai, T., Kawaguchi, Y., Higuchi, S., Sakaida, M., Endo, R., Nishizuka, S., Tsuji, I., Hitomi, J., Nakamura, M., Ogasawara, K., Yaegashi, N., Kinoshita, K., Kure, S., Sakai, A., Kobayashi, S., Sobue, K., Sasaki, M., and Yamamoto, M. (2021) Study profile of the Tohoku medical megabank community-based cohort study. J Epidemiol. 31, 65–76

6. Kuriyama, S., Metoki, H., Kikuya, M., Obara, T., Ishikuro, M., Yamanaka, C., Nagai, M., Matsubara, H., Kobayashi, T., Sugawara, J., Tamiya, G., Hozawa, A., Nakaya, N., Tsuchiya, N., Nakamura, T., Narita, A., Kogure, M., Hirata, T., Tsuji, I., Nagami, F., Fuse, N., Arai, T., Kawaguchi, Y., Higuchi, S., Sakaida, M., Suzuki, Y., Osumi, N., Nakayama, K., Ito, K., Egawa, S., Chida, K., Kodama, E., Kiyomoto, H., Ishii, T., Tsuboi, A., Tomita, H., Taki, Y., Kawame, H., Suzuki, K., Ishii, N., Ogishima, S., Mizuno, S., Takai-Igarashi, T., Minegishi, N., Yasuda, J., Igarashi, K., Shimizu, R., Nagasaki, M., Tanabe, O., Koshiba, S., Hashizume, H., Motohashi, H., Tominaga, T., Ito, S., Tanno, K., Sakata, K., Shimizu, A., Hitomi, J., Sasaki, M., Kinoshita, K., Tanaka, H., Kobayashi, T., Tohoku Medical Megabank Project Study Group, Kure, S., Yaegashi, N., and Yamamoto, M. (2020) Cohort profile: Tohoku medical megabank project birth and three-generation cohort study (TMM BirThree cohort study): rationale, progress and perspective. Int. J. Epidemiol. 49, 18–19M

7. Minegishi, N., Nishijima, I., Nobukuni, T., Kudo, H., Ishida, N., Terakawa, T., Kumada, K., Yamashita, R., Katsuoka, F., Ogishima, S., Suzuki, K., Sasaki, M., Satoh, M., Tohoku Medical Megabank Project Study Group, and Yamamoto, M. (2019) Biobank establishment and sample management in the Tohoku medical megabank project. Tohoku J. Exp. Med. 248, 45–55

8. Minegishi, N., Kumada, K., Nobukuni, T., Suzuki, K., Danjoh, I., Nagami, F., Tanno, K., Ohmomo, H., Asahi, K., Shimizu, A., Hozawa, A., Kuriyama, S., Tohoku Medical Megabank Project Study Group, Fuse, N., Tominaga, T., Kure, S., Yaegashi, N., Kinoshita, K., Sasaki, M., Tanaka, H., and Yamamoto, M. (2021) dbTMM: an integrated database of large-scale cohort, genome and clinical data for the Tohoku medical megabank project. Hum Genome Var. 8, 44

9. Yasuda, J., Kinoshita, K., Katsuoka, F., Danjoh, I., Sakurai-Yageta, M., Motoike, I.N., Kuroki, Y., Saito, S., Kojima, K., Shirota, M., Saigusa, D., Otsuki, A., Kawashima, J., Yamaguchi-Kabata, Y., Tadaka, S., Aoki, Y., Mimori, T., Kumada, K., Inoue, J., Makino, S., Kuriki, M., Fuse, N., Koshiba, S., Tanabe, O., Nagasaki, M., Tamiya, G., Shimizu, R., Takai-Igarashi, T., Ogishima, S., Hozawa, A., Kuriyama, S., Sugawara, J., Tsuboi, A., Kiyomoto, H., Ishii, T., Tomita, H., Minegishi, N., Suzuki, Y., Suzuki, K., Kawame, H., Tanaka, H., Taki, Y., Yaegashi, N., Kure, S., Nagami, F., Tohoku Medical Megabank Project Study Group, Kosaki, K., Sutoh, Y., Hachiya, T., Shimizu, A., Sasaki, M., and Yamamoto, M. (2019) Genome analyses for the Tohoku medical megabank project towards establishment of personalized healthcare. J. Biochem. 165, 139–158

10. Sakurai-Yageta, M., Kumada, K., Gocho, C., Makino, S., Uruno, A., Tadaka, S., Motoike, I.N., Kimura, M., Ito, S., Otsuki, A., Narita, A., Kudo, H., Aoki, Y., Danjoh, I., Yasuda, J., Kawame, H., Minegishi, N., Koshiba, S., Fuse, N., Tamiya, G., Yamamoto, M., and Kinoshita, K. (2021) Japonica Array NEO with increased genome-wide coverage and abundant disease risk SNPs. J. Biochem. 170, 399–410

11. Otsuki, A., Okamura, Y., Aoki, Y., Ishida, N., Kumada, K., Minegishi, N., Katsuoka, F., Kinoshita, K., and Yamamoto, M. (2021) Identification of dominant transcripts in oxidative stress response by a full-length transcriptome analysis. Mol. Cell. Biol. 41, e00472–e00420

12. Hachiya, T., Furukawa, R., Shiwa, Y., Ohmomo, H., Ono, K., Katsuoka, F., Nagasaki, M., Yasuda, J., Fuse, N., Kinoshita, K., Yamamoto, M., Tanno, K., Satoh, M., Endo, R., Sasaki, M., Sakata, K., Kobayashi, S., Ogasawara, K., Hitomi, J., Sobue, K., and Shimizu, A. (2017) Genome-wide identification of inter-individually variable DNA methylation sites improves the efficacy of epigenetic association studies. NPJ Genom Med. 2, 11

13. Koshiba, S., Motoike, I., Saigusa, D., Inoue, J., Shirota, M., Katoh, Y., Katsuoka, F., Danjoh, I., Hozawa, A., Kuriyama, S., Minegishi, N., Nagasaki, M., Takai-Igarashi, T., Ogishima, S., Fuse, N., Kure, S., Tamiya, G., Tanabe, O., Yasuda, J., Kinoshita, K., and Yamamoto, M. (2018) Omics research project on prospective cohort studies from the Tohoku Medical Megabank Project. Genes Cells 23, 406–417

14. Saito, S., Aoki, Y., Tamahara, T., Goto, M., Matsui, H., Kawashima, J., Danjoh, I., Hozawa, A., Kuriyama, S., Suzuki, Y., Fuse, N., Kure, S., Yamashita, R., Tanabe, O., Minegishi, N., Kinoshita, K., Tsuboi, A., Shimizu, R., and Yamamoto, M. (2021) Oral microbiome analysis in prospective genome cohort studies of the Tohoku medical megabank project. Front. Cell. Infect. Microbiol. 10, 604596

15. Tadaka, S., Hishinuma, E., Komaki, S., Motoike, I.N., Kawashima, J., Saigusa, D., Inoue, J., Takayama, J., Okamura, Y., Aoki, Y., Shirota, M., Otsuki, A., Katsuoka, F., Shimizu, A., Tamiya, G., Koshiba, S., Sasaki, M., Yamamoto, M., and Kinoshita, K. (2021) jMorp updates in 2020: large enhancement of multi-omics data resources on the general Japanese population. Nucleic Acids Res. 49, D536–D544

16. Byron, S.A., Van Keuren-Jensen, K.R., Engelthaler, D.M., Carpten, J.D., and Craig, D.W. (2016) Translating RNA sequencing into clinical diagnostics: Opportunities and challenges. Nat Rev Genet 17, 257–271

17. Koks, G., Pfaff, A.L., Bubb, V.J., Quinn, J.P., and Koks, S. (2021) At the dawn of the transcriptomic medicine. Exp Biol Med (Maywood) 246, 286–292

18. Peters, M.J., Joehanes, R., Pilling, L.C., Schurmann, C., Conneely, K.N., Powell, J., Reinmaa, E., Sutphin, G.L., Zhernakova, A., Schramm, K., Wilson, Y.A., Kobes, S., Tukiainen, T., NABEC/UKBEC Consortium, Ramos, Y.F., Göring, H.H.H., Fornage, M., Liu, Y., Gharib, S.A., Stranger, B.E., De Jager, Aviv, A., Levy, D., Murabito, J.M., Munson, P.J., Huan, T., Hofman, A., Uitterlinden, A.G., Rivadeneira, F., van Rooij, Stolk, L., Broer, L., Verbiest, M.M.P.J., Jhamai, M., Arp, P., Metspalu, A., Tserel, L., Milani, L., Samani, N.J., Peterson, P., Kasela, S., Codd, V., Peters, A., Ward-Caviness, C.K., Herder, C., Waldenberger, M., Roden, M., Singmann, P., Zeilinger, S., Illig, T., Homuth, G., Grabe, H.-J., Völzke, H., Steil, L., Kocher, T., Murray, A., Melzer, D., Yaghootkar, H., Bandinelli, S., Moses, E.K., Kent, J.W., Curran, J.E., Johnson, M.P., Williams-Blangero, S., Westra, H.-J., McRae, A.F., Smith, J.A., Kardia, S.L.R., Hovatta, I., Perola, M., Ripatti, S., Salomaa, V., Henders, A.K., Martin, N.G., Smith, A.K., Mehta, D., Binder, E.B., Nylocks, K.M., Kennedy, E.M., Klengel, T., Ding, J., Suchy-Dicey, A.M., Enquobahrie, D.A., Brody, J., Rotter, J.I., Chen, Y.-D.I., Houwing-Duistermaat, J., Kloppenburg, M., Slagboom, P.E., Helmer, Q., den Hollander, Bean, S., Raj, T., Bakhshi, N., Wang, Q.P., Oyston, L.J., Psaty, B.M., Tracy, R.P., Montgomery, G.W., Turner, S.T., Blangero, J., Meulenbelt, I., Ressler, K.J., Yang, J., Franke, L., Kettunen, J., Visscher, P.M., Neely, G.G., Korstanje, R., Hanson, R.L., Prokisch, H., Ferrucci, L., Esko, T., Teumer, A., van Meurs, and Johnson, A.D. (2015) The transcriptional landscape of age in human peripheral blood. Nat. Commun. 6, 8570

19.

Jansen
,
R.
,
Batista
,
S.
,
Brooks
,
A.I.
,
Tischfield
,
J.A.
,
Willemsen
,
G.
,
van
 
Grootheest
,
G.
,
Hottenga
,
J.-J.
,
Milaneschi
,
Y.
,
Mbarek
,
H.
,
Madar
,
V.
,
Peyrot
,
W.
,
Vink
,
J.M.
,
Verweij
,
C.L.
,
de
 
Geus
,
Smit
,
J.H.
,
Wright
,
F.A.
,
Sullivan
,
P.F.
,
Boomsma
,
D.I.
, and
Penninx
,
B.W.J.H.
(
2014
,
2014
)
Sex differences in the human peripheral blood transcriptome
.
BMC Genomics
 
15
, 33

20.

Schmidt
,
M.
,
Hopp
,
L.
,
Arakelyan
,
A.
,
Kirsten
,
H.
,
Engel
,
C.
,
Wirkner
,
K.
,
Krohn
,
K.
,
Burkhardt
,
R.
,
Thiery
,
J.
,
Loeffler
,
M.
,
Loeffler-Wirth
,
H.
, and
Binder
,
H.
(
2020
)
The human blood transcriptome in a large population cohort and its relation to aging and health
.
Front Big Data.
 
3
, 548873

21.

de
 
Almeida Chuffa
,
L.G.
,
Freire
,
P.P.
,
dos
 
Santos
,
S.J.
,
de
 
Mello
,
M.C.
,
de
 
Oliveira
,
N.M.
, and
Carvalho
,
R.F.
(
2022
)
Aging whole blood transcriptome reveals candidate genes for SARS-CoV-2-related vascular and immune alterations
.
J. Mol. Med.
 
100
,
285
301

22.

The GTEx Consortium atlas of genetic regulatory effects across human tissues

The GTEx Consortium
.
Available from:
 www.gtexportal.org

23.

Basu
,
M.
,
Wang
,
K.
,
Ruppin
,
E.
, and
Hannenhalli
,
S.
(
2021
)
Predicting tissue-specific gene expression from whole blood transcriptome
.
Sci. Adv.
 
7
,
7
, eabd6991

24.

Shimizu
,
T.
,
Kozu
,
Y.
,
Hiranuma
,
H.
,
Gon
,
Y.
,
Izumi
,
N.
,
Nagata
,
K.
,
Ueda
,
K.
,
Taki
,
R.
,
Hanada
,
S.
,
Kawamura
,
K.
,
Ichikado
,
K.
,
Nishiyama
,
K.
,
Muranaka
,
H.
,
Nakamura
,
K.
,
Hashimoto
,
N.
,
Wakahara
,
K.
,
Koji
,
S.
,
Omote
,
N.
,
Ando
,
A.
,
Kodama
,
N.
,
Kaneyama
,
Y.
,
Maeda
,
S.
,
Kuraki
,
T.
,
Matsumoto
,
T.
,
Yokote
,
K.
,
Nakada
,
T.-A.
,
Abe
,
R.
,
Oshima
,
T.
,
Shimada
,
T.
,
Harada
,
M.
,
Takahashi
,
T.
,
Ono
,
H.
,
Sakurai
,
T.
,
Shibusawa
,
T.
,
Kimizuka
,
Y.
,
Kawana
,
A.
,
Sano
,
T.
,
Watanabe
,
C.
,
Suematsu
,
R.
,
Sageshima
,
H.
,
Yoshifuji
,
A.
,
Ito
,
K.
,
Takahashi
,
S.
,
Ishioka
,
K.
,
Nakamura
,
M.
,
Masuda
,
M.
,
Wakabayashi
,
A.
,
Watanabe
,
H.
,
Ueda
,
S.
,
Nishikawa
,
M.
,
Chihara
,
Y.
,
Takeuchi
,
M.
,
Onoi
,
K.
,
Shinozuka
,
J.
,
Sueyoshi
,
A.
,
Nagasaki
,
Y.
,
Okamoto
,
M.
,
Ishihara
,
S.
,
Shimo
,
M.
,
Tokunaga
,
Y.
,
Kusaka
,
Y.
,
Ohba
,
T.
,
Isogai
,
S.
,
Ogawa
,
A.
,
Inoue
,
T.
,
Fukuyama
,
S.
,
Eriguchi
,
Y.
,
Yonekawa
,
A.
,
Kan-O
,
K.
,
Matsumoto
,
K.
,
Kanaoka
,
K.
,
Ihara
,
S.
,
Komuta
,
K.
,
Inoue
,
Y.
,
Chiba
,
S.
,
Yamagata
,
K.
,
Hiramatsu
,
Y.
,
Kai
,
H.
,
Asano
,
K.
,
Oguma
,
T.
,
Ito
,
Y.
,
Hashimoto
,
S.
,
Yamasaki
,
M.
,
Kasamatsu
,
Y.
,
Komase
,
Y.
,
Hida
,
N.
,
Tsuburai
,
T.
,
Oyama
,
B.
,
Takada
,
M.
,
Kanda
,
H.
,
Kitagawa
,
Y.
,
Fukuta
,
T.
,
Miyake
,
T.
,
Yoshida
,
S.
,
Ogura
,
S.
,
Abe
,
S.
,
Kono
,
Y.
,
Togashi
,
Y.
,
Takoi
,
H.
,
Kikuchi
,
R.
,
Ogawa
,
S.
,
Ogata
,
T.
,
Ishihara
,
S.
,
Kanehiro
,
A.
,
Ozaki
,
S.
,
Fuchimoto
,
Y.
,
Wada
,
S.
,
Fujimoto
,
N.
,
Nishiyama
,
K.
,
Terashima
,
M.
,
Yoshida
,
S.B. .K.
,
Narumoto
,
O.
,
Nagai
,
H.
,
Ooshima
,
N.
,
Motegi
,
M.
,
Umeda
,
A.
,
Miyagawa
,
K.
,
Shimada
,
H.
,
Endo
,
M.
,
Ohira
,
Y.
,
Watanabe
,
M.
,
Inoue
,
S.
,
Igarashi
,
A.
,
Sato
,
M.
,
Sagara
,
H.
,
Tanaka
,
A.
,
Ohta
,
S.
,
Kimura
,
T.
,
Shibata
,
Y.
,
Tanino
,
Y.
,
Nikaido
,
T.
,
Minemura
,
H.
,
Sato
,
Y.
,
Yamada
,
Y.
,
Hashino
,
T.
,
Shinoki
,
M.
,
Iwagoe
,
H.
,
Takahashi
,
H.
,
Fujii
,
K.
,
Kishi
,
H.
,
Kanai
,
M.
,
Imamura
,
T.
,
Yamashita
,
T.
,
Yatomi
,
M.
,
Maeno
,
T.
,
Hayashi
,
S.
,
Takahashi
,
M.
,
Kuramochi
,
M.
,
Kamimaki
,
I.
,
Tominaga
,
Y.
,
Ishii
,
T.
,
Utsugi
,
M.
,
Ono
,
A.
,
Tanaka
,
T.
,
Kashiwada
,
T.
,
Fujita
,
K.
,
Saito
,
Y.
,
Seike
,
M.
,
Watanabe
,
H.
,
Matsuse
,
H.
,
Kodaka
,
N.
,
Nakano
,
C.
,
Oshio
,
T.
,
Hirouchi
,
T.
,
Makino
,
S.
,
Egi
,
M.
,
Omae
,
Y.
,
Nannya
,
Y.
,
Ueno
,
T.
,
Katayama
,
T.T. .K.
,
Ai
,
M.
,
Kumanogoh
,
A.
,
Sato
,
T.
,
Hasegawa
,
N.
,
Tokunaga
,
K.
,
Ishii
,
M.
,
Koike
,
R.
,
Kitagawa
,
Y.
,
Kimura
,
A.
,
Imoto
,
S.
,
Miyano
,
S.
,
Ogawa
,
S.
,
Kanai
,
T.
,
Fukunaga
,
K.
, and
Okada
,
Y.
(
2022
)
The whole blood transcriptional regulation landscape in 465 COVID-19 infected samples from Japan COVID-19 task force
.
Nat. Commun.
 
13
, 4830

25.

Harrington
,
C.A.
,
Fei
,
S.S.
,
Minnier
,
J.
,
Carbone
,
L.
,
Searles
,
R.
,
Davis
,
B.A.
,
Ogle
,
K.
,
Planck
,
S.R.
,
Rosenbaum
,
J.T.
, and
Choi
,
D.
(
2020
)
RNA-Seq of human whole blood: evaluation of globin RNA depletion on Ribo-zero library method
.
Sci. Rep.
 
10
, 6271

26.

Bolger
,
A.M.
,
Lohse
,
M.
, and
Usadel
,
B.
(
2014
)
Trimmomatic: a flexible trimmer for Illumina sequence data
.
Bioinformatics
 
30
,
2114
2120

27.

Dobin
,
A.
,
Davis
,
C.A.
,
Schlesinger
,
F.
,
Drenkow
,
J.
,
Zaleski
,
C.
,
Jha
,
S.
,
Batut
,
P.
,
Chaisson
,
M.
, and
Gingeras
,
T.R.
(
2013
)
STAR: ultrafast universal RNA-seq aligner
.
Bioinformatics
 
29
,
15
21

28.

Frankish
,
A.
,
Diekhans
,
M.
,
Jungreis
,
I.
,
Lagarde
,
J.
,
Loveland
,
J.E.
,
Mudge
,
J.M.
,
Sisu
,
C.
,
Wright
,
J.C.
,
Armstrong
,
J.
,
Barnes
,
I.
,
Berry
,
A.
,
Bignell
,
A.
,
Boix
,
C.
,
Sala
,
S.C.
,
Cunningham
,
F.
,
Di Domenico
,
Donaldson
,
S.
,
Fiddes
,
I.T.
,
Girón
,
C.G.
,
Gonzalez
,
J.M.
,
Grego
,
T.
,
Hardy
,
M.
,
Hourlier
,
T.
,
Howe
,
K.L.
,
Hunt
,
T.
,
Izuogu
,
O.G.
,
Johnson
,
R.
,
Martin
,
F.J.
,
Martínez
,
L.
,
Mohanan
,
S.
,
Muir
,
P.
,
Navarro
,
F.C.P.
,
Parker
,
A.
,
Pei
,
B.
,
Pozo
,
F.
,
Riera
,
F.C.
,
Ruffier
,
M.
,
Schmitt
,
B.M.
,
Stapleton
,
E.
,
Suner
,
M.-M.
,
Sycheva
,
I.
,
Uszczynska-Ratajczak
,
B.
,
Wolf
,
M.Y.
,
Xu
,
J.
,
Yang
,
Y.T.
,
Yates
,
A.
,
Zerbino
,
D.
,
Zhang
,
Y.
,
Choudhary
,
J.S.
,
Gerstein
,
M.
,
Guigó
,
R.
,
Hubbard
,
T.J.P.
,
Kellis
,
M.
,
Paten
,
B.
,
Tress
,
M.L.
, and
Flicek
,
P.
(
2021
)
GENCODE 2021
.
Nucleic Acids Res.
 
49
,
D916
D923

29.

Wagner
,
G.P.
,
Kin
,
K.
, and
Lynch
,
V.J.
(
2012
)
Measurement of mRNA abundance using RNA-seq data: RPKM measure is inconsistent among samples
.
Theory Biosci.
 
131
,
281
285

30.

Li
,
B.
and
Dewey
,
C.N.
(
2011
)
RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome
.
BMC Bioinformatics.
 
4
, 323

31.

Love
,
M.I.
,
Huber
,
W.
, and
Anders
,
S.
(
2014
)
Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2
.
Genome Biol.
 
15
, 550

32.

Qi
,
D.
,
Geng
,
Y.
,
Cardenas
,
J.
,
Gu
,
J.
,
Yi
,
S.S.
,
Huang
,
J.H.
,
Fonkem
,
E.
, and
Wu
,
E.
(
2023
)
Transcriptomic analyses of patient peripheral blood with hemoglobin depletion reveal glioblastoma biomarkers
.
NPJ Genom Med.
 
8
, 2

33.

Mastrokolias
,
A.
,
den
 
Dunnen
,
J.T.
,
van
 
Ommen
,
G.J.B.
,
‘t Hoen
,
P.A.C.
, and
van
 
Roon-Mom
,
W.M.C.
(
2012
)
Increased sensitivity of next generation sequencing-based expression profiling after globin reduction in human blood RNA
.
BMC Genomics
 
13
, 28

34.

Klos
,
A.
,
Tenner
,
A.J.
,
Johswich
,
K.O.
,
Ager
,
R.R.
,
Reis
,
E.S.
, and
Köhl
,
J.
(
2009
)
The role of the anaphylatoxins in health and disease
.
Mol. Immunol.
 
46
,
2753
2766

35.

Kobayashi
,
E.H.
,
Suzuki
,
T.
,
Funayama
,
R.
,
Nagashima
,
T.
,
Hayashi
,
M.
,
Sekine
,
H.
,
Tanaka
,
N.
,
Moriguchi
,
T.
,
Motohashi
,
H.
,
Nakayama
,
K.
, and
Yamamoto
,
M.
(
2016
)
Nrf2 suppresses macrophage inflammatory response by blocking proinflammatory cytokine transcription
.
Nat. Commun.
 
7
, 11624

36.

Togami
,
Y.
,
Matsumoto
,
H.
,
Yoshimura
,
J.
,
Matsubara
,
T.
,
Ebihara
,
T.
,
Matsuura
,
H.
,
Mitsuyama
,
Y.
,
Kojima
,
T.
,
Ishikawa
,
M.
,
Sugihara
,
F.
,
Hirata
,
H.
,
Okuzaki
,
D.
, and
Ogura
,
H.
(
2022
)
Significance of interferon signaling based on mRNA-microRNA integration and plasma protein analyses in critically ill COVID-19 patients
.
Mol Ther Nucleic Acids.
 
13
,
343
353

37.

Shinozaki
,
F.
,
Kamei
,
A.
,
Shimada
,
K.
,
Matsuura
,
H.
,
Shibata
,
T.
,
Ikeuchi
,
M.
,
Yasuda
,
K.
,
Oroguchi
,
T.
,
Kishimoto
,
N.
,
Takashimizu
,
S.
,
Nishizaki
,
Y.
, and
Abe
,
K.
(
2023
)
Ingestion of taxifolin-rich foods affects brain activity, mental fatigue, and the whole blood transcriptome in healthy young adults: a randomized, double-blind, placebo-controlled, crossover study
.
Food Funct.
 
14
,
3600
3612

38.

Shin
,
H.
,
Shannon
,
C.P.
,
Fishbane
,
N.
,
Ruan
,
J.
,
Zhou
,
M.
,
Balshaw
,
R.
,
Wilson-McManus
,
J.E.
,
Ng
,
R.T.
,
McManus
,
B.M.
,
Tebbutt
,
S.J.
, and
PROOF Centre of Excellence Team
(
2014
)
Variation in RNA-Seq transcriptome profiles of peripheral whole blood from healthy individuals with and without globin depletion
.
PLoS One
 
9
, e91041

39.

Otsuki
,
A.
,
Okamura
,
Y.
,
Ishida
,
N.
,
Tadaka
,
S.
,
Takayama
,
J.
,
Kumada
,
K.
,
Kawashima
,
J.
,
Taguchi
,
K.
,
Minegishi
,
N.
,
Kuriyama
,
S.
,
Tamiya
,
G.
,
Kinoshita
,
K.
,
Katsuoka
,
F.
, and
Yamamoto
,
M.
(
2022
)
Construction of a trio-based structural variation panel utilizing activated T lymphocytes and long-read sequencing technology
.
Commun Biol.
 
5
, 991

40.

Lee
,
S.T.
,
Yoo
,
E.H.
,
Kim
,
J.Y.
,
Kim
,
J.W.
, and
Ki
,
C.S.
(
2010
)
Multiplex ligation-dependent probe amplification screening of isolated increased HbF levels revealed three cases of novel rearrangements/deletions in the β-globin gene cluster
.
Br. J. Haematol.
 
148
,
154
160

41.

Shen
,
Y.
,
Verboon
,
J.M.
,
Zhang
,
Y.
,
Liu
,
N.
,
Kim
,
Y.J.
,
Marglous
,
S.
,
Nandakumar
,
S.K.
,
Voit
,
R.A.
,
Fiorini
,
C.
,
Ejaz
,
A.
,
Basak
,
A.
,
Orkin
,
S.H.
,
Xu
,
J.
, and
Sankaran
,
V.G.
(
2021
)
A unified model of human hemoglobin switching through single-cell genome editing
.
Nat. Commun.
 
12
, 4991

42.

Aguet
,
F.
,
Barbeira
,
A.N.
,
Bonazzola
,
R.
,
Brown
,
A.
,
Castel
,
S.E.
,
Jo
,
B.
 et al. (
2020
)
The impact of sex on gene expression across human tissues
.
Science (1979)
 
369
,

43.

Huang
,
Z.
,
Chen
,
B.
,
Liu
,
X.
,
Li
,
H.
,
Xie
,
L.
,
Gao
,
Y.
,
Duan
,
R.
,
Li
,
Z.
,
Zhang
,
J.
,
Zheng
,
Y.
, and
Su
,
W.
(
2021
)
Effects of sex and aging on the immune cell landscape as assessed by single-cell transcriptomic analysis
.
Proc. Natl. Acad. Sci. U. S. A.
 
118
, e2023216118

44.

Blobel
,
G.A.
,
Sieff
,
C.A.
, and
Orkin
,
S.H.
(
1995
)
Ligand-dependent repression of the erythroid transcription factor GATA-1 by the estrogen receptor
.
Mol. Cell. Biol.
 
15
,
3147
3153

45.

Krumsiek
,
J.
,
Mittelstrass
,
K.
,
Do
,
K.T.
,
Stückler
,
F.
,
Ried
,
J.
,
Adamski
,
J.
,
Peters
,
A.
,
Illig
,
T.
,
Kronenberg
,
F.
,
Friedrich
,
N.
,
Nauck
,
M.
,
Pietzner
,
M.
,
Mook-Kanamori
,
D.O.
,
Suhre
,
K.
,
Gieger
,
C.
,
Grallert
,
H.
,
Theis
,
F.J.
, and
Kastenmüller
,
G.
(
2015
)
Gender-specific pathway differences in the human serum metabolome
.
Metabolomics
 
11
,
1815
1833

46.

Oertelt-Prigione
,
S.
(
2012
)
The influence of sex and gender on the immune response
.
Autoimmun. Rev.
 
11
,
A479
A485

47.

Ng
,
M.
and
Hazrati
,
L.N.
(
2022
)
Evidence of sex differences in cellular senescence
.
Neurobiol. Aging
 
120
,
88
104

48.

Kander
,
M.C.
,
Cui
,
Y.
, and
Liu
,
Z.
(
2017
)
Gender difference in oxidative stress: a new look at the mechanisms for cardiovascular diseases
.
J. Cell. Mol. Med.
 
21
,
1024
1032

49.

Holmgren
,
A.
(
2000
)
Antioxidant function of Thioredoxin and Glutaredoxin systems
.
Antioxid. Redox Signal.
 
2
,
811
820

50.

Wright
,
M.L.
,
Goin
,
D.E.
,
Smed
,
M.K.
,
Jewell
,
N.P.
,
Nelson
,
J.L.
,
Olsen
,
J.
,
Hetland
,
M.L.
,
Zoffmann
,
V.
, and
Jawaheer
,
D.
(
2023
)
Pregnancy-associated systemic gene expression compared to a pre-pregnancy baseline, among healthy women with term pregnancies
.
Front. Immunol.
 
14
, 1161084

51.

Kruzel
,
M.L.
,
Zimecki
,
M.
, and
Actor
,
J.K.
(
2017
)
Lactoferrin in a context of inflammation-induced pathology
.
Front. Immunol.
 
8
, 1438

52.

Lee
,
E.J.
,
Han
,
J.E.
,
Woo
,
M.S.
,
Shin
,
J.A.
,
Park
,
E.M.
,
Kang
,
J.L.
,
Moon
,
P.G.
,
Baek
,
M.-C.
,
Son
,
W.-S.
,
Ko
,
Y.T.
,
Choi
,
J.W.
, and
Kim
,
H.-S.
(
2014
)
Matrix Metalloproteinase-8 plays a pivotal role in Neuroinflammation by modulating TNF-α activation
.
J. Immunol.
 
193
,
2384
2393

53.

Zheng
,
Y.
,
Xu
,
J.
,
Liang
,
S.
,
Lin
,
D.
, and
Banerjee
,
S.
(
2018
)
Whole exome sequencing identified a novel heterozygous mutation in HMBS gene in a Chinese patient with acute intermittent porphyria with rare type of mild anemia
.
Front. Genet.
 
9
, 129

54.

Gibson
,
G.
,
Powell
,
J.E.
, and
Marigorta
,
U.M.
(
2015
)
Expression quantitative trait locus analysis for translational medicine
.
Genome Med
 
7
, 60

Author notes

co-first author

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact [email protected]

Supplementary data