Ying Zhang, Matthew Kim, Michael Prerau, Daniel Mobley, Michael Rueschman, Kathryn Sparks, Meg Tully, Shaun Purcell, Susan Redline, The National Sleep Research Resource: making data findable, accessible, interoperable, reusable and promoting sleep science, Sleep, Volume 47, Issue 7, July 2024, zsae088, https://doi.org/10.1093/sleep/zsae088
Abstract
This paper presents a comprehensive overview of the National Sleep Research Resource (NSRR), a National Heart Lung and Blood Institute-supported repository developed to share data from clinical studies focused on the evaluation of sleep disorders. The NSRR addresses challenges presented by the heterogeneity of sleep-related data, leveraging innovative strategies to optimize the quality and accessibility of available datasets. It provides authorized users with secure centralized access to a large quantity of sleep-related data including polysomnography, actigraphy, demographics, patient-reported outcomes, and other data. In developing the NSRR, we have implemented data processing protocols that ensure de-identification and compliance with FAIR (Findable, Accessible, Interoperable, Reusable) principles. Heterogeneity stemming from intrinsic variation in the collection, annotation, definition, and interpretation of data has proven to be one of the primary obstacles to efficient sharing of datasets. Approaches employed by the NSRR to address this heterogeneity include (1) development of standardized sleep terminologies utilizing a compositional coding scheme, (2) specification of comprehensive metadata, (3) harmonization of commonly used variables, and (4) computational tools developed to standardize signal processing. We have also leveraged external resources to engineer a domain-specific approach to data harmonization. We describe the scope of data within the NSRR, its role in promoting sleep and circadian research through data sharing, and harmonization of large datasets and analytical tools. Finally, we identify opportunities for the field of sleep medicine to further support data standardization and sharing.

This manuscript introduces the National Sleep Research Resource (NSRR), a pioneering repository providing access to a large quantity of diverse sleep-related data, crucial for understanding sleep disorders and their systemic impacts on health and health disparities. Adhering to FAIR principles, the NSRR addresses data heterogeneity in sleep data by standardizing sleep terminologies, specifying comprehensive metadata standards, harmonizing commonly used data, and developing computational tools to standardize signal processing. This platform is vital for bridging knowledge gaps in sleep research, promoting innovative data analysis, and enabling translational research. Its approach can inform data sharing, metadata, and Common Data Element development in other domains, significantly enhancing scientific discovery and productivity, statistical power, rigor, and reproducibility in sleep and circadian science.
The generation of massive volumes of biomedical data from multiple sources in combination with a need for greater rigor and reproducibility of scientific research findings has spurred efforts to promote data sharing and standardization to create “big data” resources. Over the last 25 years, the National Institutes of Health (NIH) invested in various initiatives to support these goals, including, but not limited to: (1) the creation of over 130 domain-specific biomedical data repositories and knowledgebases; (2) the Big Data 2 Knowledge initiative that supported the development of tools and training in big data analytics [1, 2]; (3) cloud-based “ecosystems” to store, access, and analyze data, such as BioData Catalyst [3]; and most recently, (4) the NIH Data Management and Sharing requirement for NIH grantees to propose formal plans for standardizing and sharing newly generated research data [4]. Much effort was focused on areas such as cancer and imaging where opportunities were identified to apply machine learning for improved diagnostic and prognostic tools; genomics, which requires extremely large sample sizes to detect typically small effects; and electronic health records, which are continuously generated for tens of millions of people, resulting in huge amounts of “real world” clinical data that are highly under-utilized.
Sleep and circadian data also present unique big data opportunities due to the fundamental role of sleep and circadian rhythms in nearly all physiological systems, as well as the richness of sleep and circadian datasets that include data on multiple physiological systems measured in temporally precise patterns over hours, if not days. The value of repositories and tools for accessing and analyzing physiological signals such as those obtained by electrocardiography and electroencephalography was recognized as early as 1999 when the NIH invested in the PhysioNet Research Resource for Complex Physiologic Signals. Its aim was to create “archives of digital recordings of a wide variety of physiologic signals and related data and associated tools from healthy subjects and patients with a variety of conditions.” However, until 2013 when the National Sleep Research Resource (NSRR; sleepdata.org) was launched, there were no repositories that specifically focused on sleep-related data and the needs of the sleep and circadian research communities. Over its 10-year history, while continually ingesting new datasets, the NSRR has iteratively developed approaches for improving data representation and processes for improving the accessibility and quality of annotated sleep-related summary and raw signal data. Some approaches address problems that are readily applicable to all data types, while others reflect unique aspects of sleep data. In this paper, we: (1) summarize the potential for sleep and circadian data to accelerate scientific discovery and general challenges; (2) provide an overview of the NSRR and a sample of its data; (3) describe specific challenges that impact data standardization and harmonization; (4) describe approaches for developing study-specific and variable-specific metadata and the use of signal processing tools to address FAIR principles and facilitate data harmonization; and (5) propose future directions. 
We hope that this paper will increase awareness of the value of sleep data repositories generally, as well as improve the understanding of the organization and content of the NSRR specifically, inform future data collection and annotation procedures to facilitate data harmonization, and better prepare sleep researchers to meet current NIH data sharing requirements.
Untapped potential of sleep and circadian data: motivations and goals of the NSRR
Opportunities.
Robust sleep and circadian data repositories could propel multiple scientific discoveries, enhancing the understanding of numerous complex physiological systems while filling critical knowledge gaps related to sleep disorders and their underlying population distributions, risk factors, etiological mechanisms, and impact on health and health disparities. Sleep disorders are prevalent, widely under-recognized and under-treated, and associated with significant morbidity and mortality patterns that are incompletely understood [5]. There are therefore numerous research questions that require access to sleep data from large, well-characterized, and diverse samples connected to clinical and outcome data. Notably, sleep research provides opportunities to understand multiple physiological processes, disease mechanisms, and health outcomes. For example, sleep traits are genetically correlated with multiple cardiovascular, metabolic, and hematological traits [6], providing opportunities to study shared genetic mechanisms to uncover potentially novel etiological pathways and inter-relationships underlying common chronic diseases. Sleep and circadian data provide a unique window into the dynamics and interactions of multiple physiological processes and systems. For example, the neurophysiological manifestations of sleep as measured by sleep macro-architecture (e.g. stages) and sleep micro-architecture (e.g. sleep spindle activity) change dynamically over very short time scales and provide windows into multiple brain-peripheral physiological interactions [7]. Additionally, sleep-related physiological events such as apneas, arousals, cardiac arrhythmias, periodic limb movements, and seizures occur in temporally complex and informative patterns, reflecting the influences of variations in sleep state, body position, circadian phase, autonomic function, and prior physiological events [8–10].
Analyses of streams of diverse data provide opportunities to discern the “cross-talk” across multiple physiological systems and to develop temporally-based interventions that anticipate and potentially prevent adverse physiological events. High-dimensional sleep data are well suited to artificial intelligence and machine learning approaches for developing algorithms that could transform the clinical management of patients with sleep disorders, but such approaches require interrogation of large and diverse datasets [11].
General challenges.
A major barrier to pursuing the many exciting research opportunities of sleep and circadian science relates to the limitations of individual datasets that often lack diversity (socio-economic, race and ethnicity, age, health conditions, exposures, etc.) and are often limited by ascertainment biases, precluding assessments of effect moderation and limiting generalizability. Individual datasets with small or modest sample sizes reduce statistical power and increase the likelihood of spurious inferences.
In the absence of very large, single-source, and richly phenotyped sleep datasets, there is a need to make multiple relevant datasets centrally accessible, and to define and represent those data so that they can be readily combined. For any data type, heterogeneity in data collection procedures, annotations, and labeling reduces the efficiency of accessing, combining, and analyzing such data. These issues are especially pertinent for sleep data, large volumes of which are routinely collected for clinical purposes by thousands of sleep laboratories per year and by numerous research programs, but using protocols that are largely not standardized with respect to collection procedures (both device-based and patient-reported) and labeling of data elements [12]. Therefore, a major need for a sleep data repository is to ensure that data ingested from diverse sources are well-curated, clearly annotated, and harmonized at various semantic and signal processing levels, ideally using standards that support the needs of the sleep as well as informatics communities. Providing access to well-annotated data from multiple sources also provides scientific opportunities to understand sources of variation attributable to technical (sensors, scorers, algorithms; as described [13]) and non-technical (socio-demographic, environmental, and genetic) factors [14]. This information can guide the interpretation of data from various sources, inform best practices in data collection, and identify important population sources of variation in biological processes.
NSRR: Content and Access
The NSRR provides the scientific community with centralized and secure access to growing numbers of datasets that include objective and/or self-reported measurements of sleep and/or circadian rhythm, including data from polysomnography, actigraphy, and patient-reported questionnaires. Data include raw physiological signal data, summary sleep data, and annotations and associated metadata, with ongoing work to generate and share the results of advanced signal analyses that quantify neurophysiological, electrocardiographic, and respiratory-related metrics. As available (for each dataset), demographic, anthropometry, medical history, laboratory, and clinical outcome data are included. Data are ingested using a process that includes documentation of ethical review and any limitations to data sharing, ascertainment that data are de-identified and do not include Protected Health Information, and review of the integrity of the incoming data. The NSRR is supported by a contract from the National Heart Lung and Blood Institute (NHLBI) with Brigham and Women’s Hospital (BWH), Boston MA; regulatory procedures are compliant with BWH’s institutional policies.
Data are made available to the community through a secure on-line data use agreement and tools for efficiently downloading large files [15, 16]. Data distribution and use are governed by each dataset’s original data use limitations. Investigators who request access to specific datasets (Supplemental Figure S1) and consent to required data access and use agreements (with BWH) can directly download files that may include polysomnogram recordings encoded as European Data Format (EDF) files, polysomnogram annotation files (e.g. containing scored “events” or labeled epochs), demographic information with linked variables, forms used for data collection, and study documentation. To date, 7,820 data requests from 11,373 registered users have been submitted, and 4,989 have been granted access to datasets hosted by the NSRR (most unapproved requests were inconsistent with dataset-specific participant consent, such as requests by a commercial entity to use data not approved for commercial use). In total, 1.36 petabytes of information have been downloaded from the repository, with an average download rate of 25–35 terabytes per month. As a result of this activity and the subsequent use of downloaded information in secondary research and analysis, the NSRR has been cited as a principal resource in approximately 400 indexed publications [17].
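Because polysomnogram recordings are distributed as EDF files, a first step after download is often to inspect which channels a recording contains and at what sampling rates. The following is a minimal sketch, assuming the header layout of the published EDF specification (a 256-byte fixed block followed by 256 bytes per signal, stored field by field). The function names `read_edf_header` and `build_demo_header` and the synthetic header values are illustrative assumptions; this is not the NSRR's own tooling, and an established EDF reader would be preferable for real analyses.

```python
import io

# Per-signal header fields in EDF order: (name, width in bytes).
# Each field is stored for all signals before the next field begins.
SIGNAL_FIELDS = [
    ("label", 16), ("transducer", 80), ("unit", 8),
    ("physical_min", 8), ("physical_max", 8),
    ("digital_min", 8), ("digital_max", 8),
    ("prefilter", 80), ("samples_per_record", 8), ("reserved", 32),
]


def _text(raw):
    """Decode a space-padded ASCII header field."""
    return raw.decode("ascii", errors="replace").strip()


def read_edf_header(stream):
    """Return record timing plus each channel's label, unit, and sampling rate."""
    fixed = stream.read(256)
    if len(fixed) < 256:
        raise ValueError("header shorter than the 256-byte fixed EDF block")
    n_records = int(_text(fixed[236:244]))
    record_seconds = float(_text(fixed[244:252]))
    n_signals = int(_text(fixed[252:256]))

    block = stream.read(256 * n_signals)
    if len(block) < 256 * n_signals:
        raise ValueError("truncated signal header")
    columns, offset = {}, 0
    for name, width in SIGNAL_FIELDS:
        columns[name] = [
            _text(block[offset + i * width: offset + (i + 1) * width])
            for i in range(n_signals)
        ]
        offset += width * n_signals

    channels = []
    for i in range(n_signals):
        samples = int(columns["samples_per_record"][i])
        # Sampling rate = samples per data record / record duration.
        rate = samples / record_seconds if record_seconds > 0 else None
        channels.append({"label": columns["label"][i],
                         "unit": columns["unit"][i],
                         "sampling_rate_hz": rate})
    return {"n_records": n_records, "record_seconds": record_seconds,
            "channels": channels}


def build_demo_header():
    """Build a synthetic two-channel EDF header (invented values) for the demo."""
    def pad(value, width):
        return str(value).ljust(width).encode("ascii")

    per_signal = {
        "label": ["EEG C3-M2", "ECG"], "transducer": ["", ""],
        "unit": ["uV", "mV"],
        "physical_min": [-500, -5], "physical_max": [500, 5],
        "digital_min": [-32768, -32768], "digital_max": [32767, 32767],
        "prefilter": ["", ""], "samples_per_record": [128, 256],
        "reserved": ["", ""],
    }
    fixed = (pad("0", 8) + pad("X", 80) + pad("X", 80) + pad("01.01.01", 8)
             + pad("00.00.00", 8) + pad(256 * 3, 8) + pad("", 44)
             + pad(10, 8) + pad(1, 8) + pad(2, 4))
    body = b"".join(pad(value, width)
                    for name, width in SIGNAL_FIELDS
                    for value in per_signal[name])
    return fixed + body


header = read_edf_header(io.BytesIO(build_demo_header()))
for channel in header["channels"]:
    print(channel["label"], channel["sampling_rate_hz"], "Hz")
```

To inspect a downloaded recording instead of the synthetic header, the same function can be given a file opened in binary mode.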
The data within the NSRR were initially seeded by data collected under the auspices of the Sleep Reading Centers (directed by SR), with later data contributed by investigators responding to journal or sponsor requirements to share data, or as a result of an NSRR-driven data sharing campaign. As of November 2023, a total of 27 datasets had been incorporated into the NSRR, including data from 16 cohort or observational studies, 6 clinical trials, 1 experimental database, 3 clinical data banks, and 1 animal study [18–33]. These include datasets from a number of landmark studies in the field of sleep research conducted from 1995 to the present (Table 1). Collectively, sources include de-identified data from 46,214 subjects including (1) polysomnogram recordings with overnight multi-channel neurophysiological, cardiac, and respiratory data, (2) actigraphy recordings capturing multi-day 24-hour sleep-wake patterns, (3) responses to surveys asking questions about sleep habits, sleep quality, and the adverse effects of disrupted sleep, and (4) demographic information, anthropometric measurements, biochemical parameters, lifestyle behaviors, and data pertaining to comorbid medical conditions, outcomes, and events. Several “at-a-glance” matrices provide researchers with the ability to quickly identify datasets that include data types most relevant to their needs. The broad range of data within the NSRR is organized into conceptual domains with nested subdomains, as summarized (Figure 1).
Dataset | Subjects | Age range | Time frame | PSG/HSAT count | Actigraphy count | Variable count | Sleep test type | Average actigraphy days | On dbGaP |
---|---|---|---|---|---|---|---|---|---|
Sleep Heart Health Study | 5804 | 40–89 | 1995–2010 | 8444 | 0 | 1896 | II | 0 | Yes |
Honolulu-Asia Aging Study of sleep apnea | 718 | 79–97 | 1999–2000 | 717 | 0 | 11 | II | 0 | No |
Wisconsin sleep cohort | 1123 | 37–85 | 2000–2015 | 3671 | 0 | 360 | I | 0 | No |
Cleveland Family Study | 735 | 6–88 | 2001–2006 | 730 | 0 | 2657 | I | 0 | Yes |
Study of osteoporotic fractures | 461 | 65–89 | 2002–2003 | 453 | 0 | 1146 | II | 0 | Yes |
Apnea Positive Pressure Long-term Efficacy Study | 1516 | 18–84 | 2003–2008 | 1104 | 0 | 353 | I | 0 | No |
Outcomes of Sleep Disorders in Older Men (MrOS Sleep Study) | 2911 | 65–89 | 2003–2012 | 3933 | 0 | 649 | II | 0 | Yes |
Cleveland Children’s Sleep and Health Study | 517 | 16–19 | 2006–2010 | 515 | 0 | 257 | I | 0 | No |
Childhood adenotonsillectomy trial | 1243 | 5–9 | 2007–2012 | 1639 | 0 | 2901 | I | 0 | No |
Home positive airway pressure | 373 | 20–80 | 2008–2010 | 414 | 0 | 120 | I/III* | 0 | No |
Hispanic community health study/study of Latinos | 16,415 | 18–76 | 2009–2013 | 12,088 | 1887 | 1032 | III | 7 | Yes |
Heart biomarker evaluation in apnea treatment | 318 | 45–75 | 2010–2012 | 591 | 0 | 790 | III | 0 | No |
Multi-ethnic study of atherosclerosis | 2237 | 54–95 | 2010–2013 | 2056 | 2159 | 627 | II | 7 | Yes |
Nulliparous pregnancy outcomes study monitoring mothers-to-be | 3012 | 14–44 | 2011–2013 | 5341 | 0 | 392 | III | 0 | Yes |
Best apnea interventions in research | 169 | 46–76 | 2011–2014 | 518 | 0 | 205 | III | 0 | No |
Apnea, bariatric surgery, and CPAP study | 49 | 26–64 | 2011–2014 | 132 | 0 | 108 | I | 0 | No |
One year of actigraphy | 1 | 62 | 2016–2017 | 0 | 1 | 0 | n/a | 0 | No |
The economic consequences of increasing sleep among the urban poor | 597 | 25–55 | 2017–2019 | 0 | 597 | 0 | n/a | 28 | No |
Forced desynchrony with and without chronic sleep restriction | 28 | 20–34 | 2000–2016 | 1000 | 28 | 32 | I | 25 | No |
Nationwide Children’s Hospital Sleep DataBank | 3673 | 0–58 | 2017–2019 | 3984 | 0 | 31 | I | 0 | No |
Maternal sleep in pregnancy and the fetus | 106 | 18–42 | 2015–2019 | 106 | 0 | 37 | I | 0 | No |
Assessing nocturnal sleep/wake effects on risk of suicide | 971 | 18–52 | 2020–2021 | 0 | 0 | 301 | n/a | 0 | No |
Efficacy assessment of NOP agonists in non-human primates | 5 | 14–19 | 2019 | 10 | 0 | 0 | I | 0 | No |
Mignot nature communications | 3000 | 18–91 | Varies | 1438 | 0 | 0 | I | 0 | No |
Stanford technology analytics and genomics in sleep | 1881 | 13–84 | 2018–2019 | 2055 | 2055 | 441 | I | 7 | No |
Cox and Fell (2020) sleep medicine reviews | 5 | 0–100 | | 3 | 0 | 0 | I | 0 | No |
Sleep health in infancy and early childhood | 433 | 0–2 | 2016–2020 | 0 | 1257 | 319 | n/a | 7 | No |
Sleep disordered breathing, ApoE and lipid metabolism | 712 | 13–90 | 2003–2007 | 712 | 0 | 67 | I | 0 | No |
PSG: polysomnography; HSAT: home sleep apnea test.
Sleep test type: Type I: attended studies that minimally include the following channels: EEG, EOG, ECG/heart rate, chin EMG, limb EMG, respiratory effort at thorax and abdomen, oxygen saturation, and airflow from nasal cannula or thermistor. Type II: full polysomnograms (as in Type I) but performed in an unattended setting. Type III: home sleep test (HST), performed in an unattended setting with a minimum of 4 channels, minimally including 2 respiratory movement/airflow channels, 1 ECG/heart rate channel, and 1 oxygen saturation channel. Type IV: home sleep test (HST), performed in an unattended setting with a minimum of 3 channels that allows calculation of an AHI or RDI as the result of measuring airflow or thoracoabdominal movement.
*Home positive airway pressure: Type I at baseline and Type III at the follow-up visit.

Figure 1. (a) and (b) Measures and Instruments across the National Sleep Research Resource (NSRR) Sleep Questionnaires and Polysomnography Domain. Bar lengths represent the number of variables in each domain or subdomain aggregated across the full range of datasets. In (a), the colored bar represents variables from specific survey instruments, while the grey bar represents standalone variables. MEQ: Horne-Ostberg Morningness Eveningness Questionnaire; SDS Checklist-25: Sleep Disorders Symptom Checklist-25; DDNSI: Disturbing Dream and Nightmare Severity Index; Calgary SAQLI: Calgary Sleep Apnea Quality of Life Index; OSA-18: Obstructive Sleep Apnea Quality of Life Questionnaire; PSQ: Pediatric Sleep Questionnaire; SEMA: Self-Efficacy Measure of Sleep Apnea; SAQLI: Sleep Apnea Quality of Life Index; ESS: Epworth Sleepiness Scale; BRISC: The Brief Index of Sleep Control; PSQI: Pittsburgh Sleep Quality Index; FOSQ: Functional Outcomes of Sleep Questionnaire; ISI: Insomnia Severity Index; PROMIS SD: PROMIS Sleep Disturbance; PROMIS SRI: PROMIS Sleep Related Impairment; WHIIRS: Women’s Health Initiative Insomnia Rating Scale. * The full list of NSRR domains and subdomains is provided in Supplemental Figure S3.
Adherence to FAIR Principles
Adherence to FAIR (findable, accessible, interoperable, and reusable) principles [34] is a central tenet of modern data management and is included in recent federal data sharing requirements. At each stage of development of the NSRR, efforts were focused on designing and implementing a system that adheres to these principles. Prior to its initial release, the combined input of computer scientists, data scientists, sleep experts, and informaticians fostered the iterative development of a system designed to make hosted data accessible by incorporating (1) a streamlined registration process that enables users to submit requests for access to multiple datasets under a unified data access and use agreement, and (2) a secure mechanism for the reliable transfer of downloadable EDF files. During the ingestion process, the NSRR team collaborates with the contributor to further (3) develop consistent study documentation, (4) provide standardized metadata for key variables, (5) map selected variables to standardized terms and concept tags, and (6) conduct signal processing to generate canonical sets of harmonized sleep signals with standardized labels and sampling rates. These data ingestion procedures were codified and modified to reflect NIH’s 2023 Data Management and Sharing Plan requirements, including specification of data formats and metadata standards, and have been publicized through interactions with professional societies, social media, and NSRR-sponsored webinars.
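Step (6) above, producing canonical signals with standardized labels and sampling rates, can be sketched in a few lines. This is an illustration under stated assumptions rather than the NSRR's actual pipeline: the alias table `CANONICAL_LABELS`, the canonical names, and the function names are hypothetical, and resampling uses plain linear interpolation with no anti-alias filtering, which a production tool would normally add before downsampling.

```python
# Hypothetical alias table: normalized raw label -> canonical label.
CANONICAL_LABELS = {
    "eeg c3-m2": "C3_M2", "c3-m2": "C3_M2", "c3m2": "C3_M2",
    "ecg": "ECG", "ekg": "ECG", "ecg1": "ECG",
    "spo2": "SpO2", "sao2": "SpO2",
}


def canonical_label(raw):
    """Map a raw channel label to its canonical name, or None if unknown."""
    key = " ".join(raw.strip().lower().split())
    return CANONICAL_LABELS.get(key)


def resample_linear(samples, rate_in, rate_out):
    """Resample by linear interpolation (simplified: no anti-alias filter)."""
    if not samples:
        return []
    n_out = int(round(len(samples) * rate_out / rate_in))
    out = []
    for k in range(n_out):
        pos = k * rate_in / rate_out      # position in input sample units
        i = int(pos)
        frac = pos - i
        if i + 1 < len(samples):
            out.append(samples[i] * (1 - frac) + samples[i + 1] * frac)
        else:
            out.append(samples[-1])       # hold the last value at the edge
    return out


def harmonize(recording, target_rate_hz):
    """Relabel and resample; recording maps raw label -> (rate_hz, samples).

    Returns the harmonized signals and the list of labels left unmapped,
    so that unknown channels are reported rather than silently dropped.
    """
    harmonized, unmapped = {}, []
    for raw, (rate, samples) in recording.items():
        name = canonical_label(raw)
        if name is None:
            unmapped.append(raw)
            continue
        harmonized[name] = resample_linear(samples, rate, target_rate_hz)
    return harmonized, unmapped


recording = {
    " EKG ": (2, [0.0, 2.0, 4.0, 6.0]),
    "SaO2": (4, [97.0, 97.0, 96.0, 96.0]),
    "Aux7": (4, [0.0, 0.0, 0.0, 0.0]),
}
signals, unmapped = harmonize(recording, target_rate_hz=4)
print(sorted(signals), unmapped)
```

Returning the unmapped labels separately reflects a design choice: a label that is not in the alias table is a curation question for a person to resolve, not something to guess at.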
Specific Challenges: Heterogeneity in Data
There are multiple sources of heterogeneity in sleep data that impact the ability to easily find, combine, and reuse data, some of which reflect challenges faced by any federated data repository, while other variations are more specific to sleep research. As discussed in other publications [12, 35], and detailed below, the variability of sleep data reflects variations in collection procedures, annotations, and formatting (Figure 2), each of which requires specific approaches for improving the usability of the data.

Concept map of selected sources of heterogeneity in sleep data. This diagram summarizes the sources of heterogeneity in sleep and circadian data discussed in this paper.
Variability of polysomnography data collection—The American Academy of Sleep Medicine (AASM) publishes guidelines for polysomnography that include minimal requirements related to channels recorded, sensors used, and filter and sampling rates. However, these guidelines allow for a broad latitude in how data are collected (e.g. several permissible sensor types); annotation procedures (e.g. does not prohibit use of “hot keys” for annotating events such as arousals or apneas); procedures for achieving consistent polarity of recorded signals; and how data are labeled (e.g. there is no widely-adopted and comprehensive standardized nomenclature used to label the physiological channels or event annotations). Further variability in analysis of EEG, EMG, EOG, and ECG data may result from variations in choice of reference electrodes and electrode derivations, which can markedly impact the amplitude and content (e.g. features) of the measured signals, and often is not well-documented. Variations in equipment hardware can output signals that are filtered in poorly documented ways, impacting secondary analysis of measures such as airflow limitation. Heterogeneity also arises from the expanded use of diagnostic devices other than the overnight in-laboratory polysomnogram (Type I device). In fact, approximately 70% of clinical sleep studies currently utilize home-based devices that collect a limited number and variable types of physiological data [36]. These devices, which are increasing in numbers and diversity, are categorized as Types II, III, and IV, with only Types I and II including EEG data collection. Moreover, the Type III and IV devices include a wide variety of sensors, some of which are not routinely used in the “gold-standard” Type I studies (e.g. peripheral arterial tonometry), and often do not include sensors traditionally considered to be core for defining event subtypes (e.g. nasal flow). This variability in data types is reflected in the data ingested into NSRR. 
Some datasets utilized protocols for collecting comprehensive polysomnography studies monitored by sleep technicians in a laboratory setting, while others relied on a variety of home sleep study devices. As a result, across studies (and sometimes within studies when data were aggregated across different sleep laboratories), different recording approaches were used to measure airflow, oxygenation, respiratory effort, muscle tone, limb movement, eye movement, and brain activity. Such differences not only affect the availability of core parameters within and across studies, but also the ability to define events and create uniform metadata, and the overall precision and accuracy of measurements.
Event Definitions—The AASM publishes criteria for scoring specific events within the sleep study (respiratory events, stages, leg movements, etc.) [37]. However, the criteria for scoring events (particularly hypopneas) have changed multiple times over the last 15 years. These changes may result in large differences in disease classification [38, 39]. In addition, many key terms used to annotate events and define sleep disorders have evolved [40]. The original metric used to classify sleep disordered breathing focused on quantifying the number of apneic events per hour to calculate an apnea index [41]. Subsequent definitions expanded criteria to include hypopneas characterized by reductions in airflow with decreased oxygen saturation to calculate a broader index, which initially was labeled a respiratory disturbance index, and later an apnea-hypopnea index (AHI) [18]. As the AHI became accepted as a standard measure of obstructive sleep apnea, thresholds were proposed to classify disease as mild, moderate, and severe disease [42, 43]. However, the AHI was shown to be widely variable depending on which definitions were applied to define hypopneas (with variable criteria for defining critical changes in breathing amplitude and/or inclusion of desaturation and/or arousal). Due to lack of consensus, the AASM even published two definitions characterized as “recommended” and “alternative.” Subsequent revisions proposed more unified “recommended” and “acceptable” hypopnea definitions that still vary with respect to criteria related to associated oxygen saturation and/or arousal [44, 45]. Additional measures used to characterize event subtypes include detection of increased respiratory effort with cortical arousal classified as a respiratory effort-related arousal (RERA) and a summation of the AHI and RERA index designated as the respiratory disturbance index (RDI) [37, 46].
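To make the dependence of the AHI on the hypopnea definition concrete, the following minimal sketch applies two simplified rules to the same list of scored events. It is an illustration only: the `Event` fields, the function names, and the example events are hypothetical, and the rules ignore the airflow reduction thresholds that the actual AASM criteria also specify.

```python
from dataclasses import dataclass


@dataclass
class Event:
    kind: str         # "apnea", "hypopnea", or "rera" (hypothetical labels)
    desat_pct: float  # associated oxygen desaturation, in percent
    arousal: bool     # True if the event was accompanied by an arousal


def hypopnea_counts(event, min_desat_pct, arousal_counts):
    """Decide whether a scored hypopnea qualifies under a given rule."""
    if event.kind != "hypopnea":
        return False
    if event.desat_pct >= min_desat_pct:
        return True
    return arousal_counts and event.arousal


def ahi(events, sleep_hours, min_desat_pct, arousal_counts):
    """Apneas plus qualifying hypopneas per hour of sleep."""
    if sleep_hours <= 0:
        raise ValueError("sleep_hours must be positive")
    apneas = sum(e.kind == "apnea" for e in events)
    hypopneas = sum(hypopnea_counts(e, min_desat_pct, arousal_counts) for e in events)
    return (apneas + hypopneas) / sleep_hours


def rdi(events, sleep_hours, min_desat_pct, arousal_counts):
    """AHI plus the RERA index, following the summation described above."""
    reras = sum(e.kind == "rera" for e in events)
    return ahi(events, sleep_hours, min_desat_pct, arousal_counts) + reras / sleep_hours


# Made-up events for a 2-hour sleep interval (not real study data).
events = [
    Event("apnea", 5.0, False),
    Event("apnea", 2.0, True),
    Event("hypopnea", 4.5, False),
    Event("hypopnea", 3.2, False),
    Event("hypopnea", 1.0, True),
    Event("hypopnea", 1.0, False),
    Event("rera", 0.0, True),
]

ahi_3pct_or_arousal = ahi(events, 2.0, 3.0, True)   # 2 apneas + 3 hypopneas
ahi_4pct_only = ahi(events, 2.0, 4.0, False)        # 2 apneas + 1 hypopnea
rdi_3pct_or_arousal = rdi(events, 2.0, 3.0, True)   # adds 1 RERA
print(ahi_3pct_or_arousal, ahi_4pct_only, rdi_3pct_or_arousal)
```

With these made-up events, the same recording yields an AHI of 2.5 under the "3% desaturation or arousal" rule and 1.5 under the "4% desaturation" rule, which is the kind of definition-driven difference that variable-level metadata needs to capture.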
Data formatting—Sleep data are routinely collected and saved using a variety of proprietary software dictated by the specific equipment used. To avoid the need to access multiple software tools for data analysis and to standardize the presentation of data, the NSRR requests data contributors to transfer polysomnography data as EDF files (https://edfplus.info/), a standardized format developed to promote sleep data exchange [47, 48]. However, many laboratories do not routinely save data in EDF, requiring support for exporting and de-identifying data for data-sharing. While the NSRR can assist contributors with these tasks (i.e. providing deidentification tools or guidance on best practices for exporting data), these procedures are generally not automated, and their implementation can delay data sharing efforts. In addition to sometimes containing subtle corruptions, e.g. due to truncations of data transfers, EDF files themselves can vary in content and format: for example, (1) with continuous or discontinuous (gapped) recordings, (2) missing physical unit or transducer header fields, (3) inappropriate dynamic ranges or misspecified units, (4) the presence of annotations encoded within the EDF, or (5) with single night data split across multiple EDF files. Further, annotation files can occasionally be temporally misaligned with respect to the underlying signal data.
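Several of the EDF inconsistencies listed above can be detected from the file header alone. The sketch below reads the fixed 256-byte EDF header and the per-signal header block, using the field widths given in the EDF specification, and reports a few of these problems. It is not the NSRR's own tooling: the function names, the issue messages, and the synthetic demo header are assumptions made for illustration.

```python
# Per-signal header fields and their byte widths, in the order the EDF format stores them.
SIGNAL_FIELDS = [
    ("label", 16), ("transducer", 80), ("unit", 8), ("phys_min", 8), ("phys_max", 8),
    ("dig_min", 8), ("dig_max", 8), ("prefilter", 80), ("samples_per_record", 8), ("reserved", 32),
]


def _field(raw, start, width):
    return raw[start:start + width].decode("ascii", errors="replace").strip()


def parse_edf_header(raw):
    """Parse the fixed header and the per-signal header block of an EDF file."""
    n_signals = int(_field(raw, 252, 4))
    header = {
        "reserved": _field(raw, 192, 44),        # starts with "EDF+C" or "EDF+D" for EDF+
        "n_records": int(_field(raw, 236, 8)),   # -1 means unknown
        "signals": [{} for _ in range(n_signals)],
    }
    offset = 256
    # Fields are stored column-wise: all labels, then all transducers, and so on.
    for name, width in SIGNAL_FIELDS:
        for i in range(n_signals):
            header["signals"][i][name] = _field(raw, offset + i * width, width)
        offset += n_signals * width
    return header


def check_edf_header(header):
    """Return a list of human-readable issues (hypothetical wording)."""
    issues = []
    if header["reserved"].startswith("EDF+D"):
        issues.append("discontinuous (gapped) recording")
    if header["n_records"] <= 0:
        issues.append("number of data records unknown or zero")
    for sig in header["signals"]:
        label = sig["label"]
        if label == "EDF Annotations":
            issues.append("annotations encoded within the EDF")
            continue
        if not sig["unit"]:
            issues.append(f"{label}: missing physical unit")
        if not sig["transducer"]:
            issues.append(f"{label}: missing transducer type")
        if float(sig["phys_min"]) == float(sig["phys_max"]):
            issues.append(f"{label}: physical minimum equals physical maximum")
    return issues


def make_demo_header():
    """Build a synthetic two-signal header in memory (no real recording involved)."""
    def pad(text, width):
        return text.ljust(width).encode("ascii")

    signals = [("EEG C3-M2", "AgAgCl electrode", "uV", "-500", "500"),
               ("Airflow", "", "", "0", "0")]
    ns = len(signals)
    fixed = (pad("0", 8) + pad("X X X X", 80) + pad("Startdate X X X X", 80)
             + pad("01.01.85", 8) + pad("22.00.00", 8) + pad(str(256 + ns * 256), 8)
             + pad("EDF+D", 44) + pad("10", 8) + pad("30", 8) + pad(str(ns), 4))
    columns = [([s[0] for s in signals], 16), ([s[1] for s in signals], 80),
               ([s[2] for s in signals], 8), ([s[3] for s in signals], 8),
               ([s[4] for s in signals], 8), (["-32768"] * ns, 8), (["32767"] * ns, 8),
               ([""] * ns, 80), (["256"] * ns, 8), ([""] * ns, 32)]
    return fixed + b"".join(pad(v, w) for values, w in columns for v in values)


issues = check_edf_header(parse_edf_header(make_demo_header()))
for issue in issues:
    print(issue)
```

For a file on disk, the same functions could be applied to the leading bytes of the file; checks such as truncated data records or annotation misalignment would additionally require reading the signal data, which this sketch does not attempt.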
Many users are interested in training algorithms to automatically score events within the polysomnogram, or to extract novel metrics based on scored events. Those goals require access to the annotation files that provide tabular scored events, delineated by their duration, inter-event intervals, and associated features (e.g. desaturation). However, such files are encoded in a range of different formats, and the labeling, encoding, and directory structures of associated data files vary.
Actigraphy—Actigraphy data are also saved in a variety of formats and lack a single “standard.” There are scant published recommendations that guide data collection, with variability in what data are saved (counts, accelerometry motion), sampling rates, and auxiliary data (light, event markers, etc.).
Patient-reported outcomes—There is no standard set of Common Data Elements recommended for sleep or circadian research. Accordingly, the datasets within the NSRR include a variety of sleep questionnaires such as the Epworth Sleepiness Scale, Women’s Health Initiative Insomnia Rating Scale, Pediatric Sleep Questionnaire, Pittsburgh Sleep Quality Index, and Functional Outcomes of Sleep Questionnaire. Some patient-reported data are based on single items, subsets of items, or paraphrased questions abstracted from one or more instruments, with response categories and/or rating scales that are different from the validated survey instruments. Moreover, the reference period (e.g. “in the past two weeks”) varies across studies and often is not preserved in the data dictionary submitted by data owners, which poses challenges to metadata documentation and chronicity assessment of certain sleep disorders. Many items within questionnaires overlap different domains (e.g. insomnia vs sleep quality; sleepiness vs functional impairment), which makes mapping those items to specific domains challenging.
Approaches for Addressing Sleep Data Heterogeneity, Developing Metadata, and Data Harmonization
To minimize the effects of heterogeneity while providing opportunities to assess and learn from sources of heterogeneity, data are ingested using a well-defined process that captures critical metadata at the study and variable level. Innovative approaches that NSRR has employed to address data heterogeneity have stemmed from integrated initiatives that include (1) specification of study-level and variable-level metadata, including use of compositional terminology and mapping of terms to a common standard where possible, (2) standardization of sleep-wake period information, (3) post-processing polysomnography data to standardize and annotate the data and channel labels, and (4) integration and extension of harmonized variables. Figure 3 provides an overview of the data ingestion and metadata generation processes.

Outline of data ingestion and metadata generation processes in NSRR. This diagram illustrates the workflow required to generate and curate enhanced metadata in the NSRR. The original metadata consists of PDF files of study manuals, forms, and data dictionaries. Information extracted from these resources is categorized as study-level, file-level, or variable-level metadata during the data ingestion process. This structured metadata is reviewed and extended to generate several output products including: (1) semi-structured metadata in the form of a README file that serves as a dataset introduction page on the NSRR website; (2) a version-controlled standardized data dictionary that incorporates standard conceptual domains/subdomains and enhanced variable-level metadata including relevant study-level metadata, provenance information, hyperlinks to data collection forms, and standardized tags; (3) summary statistics for each variable stratified by common demographic groupings; (4) harmonized data for selected groups of variables that are comparable across datasets; (5) enhanced search results using standardized NSRR tags; and (6) an at-a-glance matrix showing the availability of data by category and PSG channel. The right panel shows a screenshot of variable-level metadata for the “ahi” variable integrated into the NSRR after review and curation.
Specification of study-level metadata.
To generate a template for the specification of study-level metadata we adopted a reporting format based on checklists promoted by the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) initiative [49]. Using foundational NSRR datasets as models, we traced the data collection, processing, and analysis approaches involved in different types of studies to identify sources of heterogeneity that could be specified as metadata elements. We compiled a set of key-value pairs for each of these elements that we used to generate a metadata intake form incorporating (1) a study overview section providing information about the investigator(s), support, study design, eligibility and exclusion criteria, exposures, interventions, outcomes, access restrictions, and a list of validated survey instruments for collecting patient-reported outcomes, (2) an actigraphy data section providing information about data collection and processing including recording devices, software, sampling rates, annotation methods, and definitions of specific times and periods of interest, and (3) a polysomnography data section providing information about data collection and processing including equipment, montages, sampling rates, data formats, scoring methods, and definitions of thresholds used to identify hypopnea events. This form has been deployed as a spreadsheet incorporating selectable and extensible options for each element that has been integrated into the NSRR data deposition process. Information abstracted from completed forms has been used to generate a matrix that provides a sortable filtered overview of the studies included in the NSRR with direct links to available datasets [19].
Specification of variable-level metadata
In the early stages of development of the NSRR we recognized that there was significant variation in the range and depth of metadata available for variables in many datasets. While we knew that advanced cross-cohort search capabilities might make it possible to retrieve similarly labelled variables from different datasets [50, 51], we were also aware that inadequate characterization of the provenance of these variables would make it difficult to determine if they were comparable.
The original plan for the NSRR called for the definition of a sleep research ontology that would serve as the basis for a structured vocabulary to characterize dataset variables. While this approach was conceptually appealing, in practice the development of an extensible ontology proved to be a cumbersome process that led to significant delays in the deployment of usable resources. One challenge was the ability to readily expand upon and integrate with existing ontologies (SNOMED and LOINC) due to their limited coverage of sleep terms. As a workable alternative, we compiled a set of canonical terms abstracted from variables included in the larger foundational datasets. These terms were edited for clarity to provide precise definitions of origins, thresholds, and states before they were added to a curated data dictionary. We were subsequently able to link variables in each dataset to terms in this data dictionary, enabling them to serve as points of connection for cross-cohort queries. When it proved to be feasible, additional metadata elements were appended to linked variables in the form of tags providing information about the source, timing, equipment, and methodology used to collect data. We also linked, when able, terms to sleep-related terms in the National Library of Medicine’s (NLM) Common Data Element library. However, coverage of sleep data within the NLM is currently limited.
This parallel approach to the specification of study-level and variable-level metadata has streamlined the workflow required to integrate datasets submitted for inclusion in the NSRR. Minimizing the ambiguity of metadata mapped at both levels effectively serves to improve the accuracy of cross-cohort queries conducted to identify comparable variables retrieved from disparate datasets.
Compositional terminology.
A special problem in specifying metadata and adopting uniform terminology relates to the marked variation in definitions of apnea and hypopneas, both over time and across datasets. To address this problem, we developed a compositional terminology configured to generate compound labels that can be parsed to provide fully qualified metadata pertaining to specific variables. To accommodate the full range of variable-level metadata pertaining to indices of sleep disordered breathing, we developed a compositional terminology modeled after the post-coordination approach utilized by the SNOMED CT system to define complex concepts [52]. This flexible scheme can be used to generate a compound label for each variable comprised of a root component and qualifying suffix components. The root component includes linked abbreviations that designate the type of event measured (Event), the measurement recorded (Data type), and any qualifiers used to characterize the measurement (Data qualifier) (Supplementary Table S1). A suffix component separated by an underscore can be appended to designate a sleep stage and body position, and additional suffix components can be added to designate a data source and level of oxygen saturation or desaturation. This scheme also incorporates precompiled suffix components that correspond to specific criteria used to identify hypopneas based on varying definitions. The labels generated using this compositional terminology can be parsed by algorithms to enable large scale mapping and harmonization of variables. Those variables that are mapped to labels can be automatically converted between wide and long data formats. When converted to a long data format, the information encoded in each label can be extracted to generate a profile of semantic terms. This approach has proven to work well with polysomnography and actigraphy data which tend to have many permutations of similar measures. 
We are also assessing the utility of applying a compositional terminology to other types of sleep data such as self-reported questionnaire items.
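As an illustration of how compound labels of this kind can be parsed, the sketch below splits underscore-separated labels into a root and suffix components and converts the result into long-format rows. The token tables are assumptions: they are inferred only from the four harmonized AHI names and descriptions given in the Figure 6 caption, not from the full scheme in Supplementary Table S1, and the function names are hypothetical.

```python
# Suffix tokens and their meaning, as inferred from the Figure 6 caption (an assumption).
HYPOPNEA_RULES = {
    "hp3r": {"min_desat_pct": 3, "arousal_counts": True},
    "hp4r": {"min_desat_pct": 4, "arousal_counts": True},
    "hp4u": {"min_desat_pct": 4, "arousal_counts": False},
}
GUIDELINES = {
    "aasm15": "AASM 2015",
    "chicago1999": "AASM Chicago 1999",
}


def parse_label(label):
    """Split a compound label into a profile of semantic terms."""
    parts = label.split("_")
    if parts and parts[0] == "nsrr":   # prefix marking NSRR-harmonized variables
        parts = parts[1:]
    if not parts or not parts[0]:
        raise ValueError(f"label has no root component: {label!r}")
    profile = {"label": label, "root": parts[0], "hypopnea_rule": None,
               "guideline": None, "unparsed": []}
    for token in parts[1:]:
        if token in HYPOPNEA_RULES:
            profile["hypopnea_rule"] = token
            profile.update(HYPOPNEA_RULES[token])
        elif token in GUIDELINES:
            profile["guideline"] = GUIDELINES[token]
        else:
            profile["unparsed"].append(token)  # keep unknown tokens visible for review
    return profile


def to_long(profile):
    """Convert a parsed profile to (label, attribute, value) rows."""
    return [(profile["label"], key, value)
            for key, value in profile.items()
            if key not in ("label", "unparsed") and value is not None]


for name in ["nsrr_ahi_chicago1999", "nsrr_ahi_hp3r_aasm15", "nsrr_ahi_hp4r", "nsrr_ahi_hp4u_aasm15"]:
    for row in to_long(parse_label(name)):
        print(row)
```

Keeping unrecognized tokens in an `unparsed` list, rather than discarding them, is a deliberate choice in this sketch so that labels outside the known vocabulary are flagged for manual curation instead of being silently mapped.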
Defining core sleep-wake information.
One of the initial challenges in sleep data standardization relates to the inconsistency in the terminology used to specify time points and intervals describing sleep and wake periods. Depending on the context and usage, “time” might refer to a specific point in time or to an interval between two time points. While “duration” and “period” could both be taken to correspond to intervals, they were often used interchangeably in protocol descriptions and data dictionaries without any indication of whether they referred to intervals between designated time points or to specific intervals when subjects were determined to be awake or asleep.
Recognizing the need to develop internally consistent terms to distinguish time points and intervals prompted us to compile a list of key concepts used to define sleep-wake intervals in available NSRR study protocols. These included specific time points, intervals between time points, and states within intervals. Review of study documentation and research publications helped to identify commonly used terms that could be mapped to specific concepts. This in turn enabled us to designate standardized terms that incorporate precise definitions of “time,” “period,” and “duration” that can be used to delineate distinct intervals, and to visualize their inter-relationships graphically (Figure 4). In the terminology developed based on these definitions, “time” refers to a specific point in time that is either recorded as a clock time or marked by when an event starts or ends. “Period” refers to a continuous interval between two specific time points defined a priori, while “duration” refers to the sum of the lengths of multiple intervals describing a specific state or condition, wherein the state can be further specified as sleep stages. Adoption of this standardized terminology allows for unambiguous demarcation of the intervals and sleep-wake states used to characterize the state-specificity of respiratory, cardiac, electroencephalographic, and movement-related events. Use of structured definitions also allowed inconsistencies in data calculations to be identified. For example, in one instance, applying standardized nomenclature identified that a summary respiratory index was calculated erroneously to include events in the interval between recording start and end times rather than only during the time asleep.

Sleep–wake terminology schema. To disambiguate notations that refer to different time points, sleep intervals, periods, and durations, the NSRR utilizes a visual schema that identifies (a) time points (going to bed, falling asleep, waking up during sleep, waking up after sleep, getting out of bed), (b) intervals (recording, in-bed), and (c) states (awake, asleep). Terminologies based on these designations are organized in reference to (d) clock times (recording start time, lights-off/in-bed time, sleep onset, sleep offset, lights-on/out-bed time, recording end time), (e) periods (recording period, in-bed period, sleep period, sleep onset latency), and (f) durations (sleep duration = sleep period − wake after sleep onset (WASO); wake after sleep onset = sleep period − sum of sleep durations within the sleep period). Note that other iterations could further distinguish stages (N1, N2, NREM, and REM) within states.
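The distinction between time points, periods, and durations can be sketched in a few lines of code. The example below derives them from an epoch-by-epoch hypnogram, under the assumptions that epochs are 30 seconds long, that the hypnogram runs from lights-off to lights-on, and that wake is coded as "W". The function and key names are hypothetical rather than NSRR variable names.

```python
EPOCH_MIN = 0.5  # assumed 30-second scoring epochs, expressed in minutes


def sleep_wake_summary(stages):
    """Derive periods and durations from per-epoch stage labels ('W' = wake)."""
    asleep = [i for i, stage in enumerate(stages) if stage != "W"]
    if not asleep:
        raise ValueError("no sleep epochs scored")
    onset, offset = asleep[0], asleep[-1] + 1      # time points, as epoch indices
    sleep_period = (offset - onset) * EPOCH_MIN    # continuous interval between two time points
    sleep_duration = len(asleep) * EPOCH_MIN       # sum of all intervals spent asleep
    return {
        "in_bed_period_min": len(stages) * EPOCH_MIN,
        "sleep_onset_latency_min": onset * EPOCH_MIN,
        "sleep_period_min": sleep_period,
        "sleep_duration_min": sleep_duration,
        "waso_min": sleep_period - sleep_duration,
    }


def per_hour(n_events, minutes):
    """Event rate per hour over a given denominator, in minutes."""
    return n_events * 60 / minutes


# Made-up hypnogram: 22 epochs from lights-off to lights-on.
stages = ["W"] * 4 + ["N1"] * 2 + ["N2"] * 6 + ["W"] * 2 + ["N3"] * 4 + ["R"] * 2 + ["W"] * 2
summary = sleep_wake_summary(stages)
print(summary)

# The same 7 events give different indices depending on the denominator.
index_per_sleep_hour = per_hour(7, summary["sleep_duration_min"])
index_per_in_bed_hour = per_hour(7, summary["in_bed_period_min"])
print(index_per_sleep_hour, index_per_in_bed_hour)
```

The last two lines show why the choice of denominator matters: dividing by the in-bed period instead of the sleep duration lowers the index, which is the kind of calculation inconsistency that explicit definitions of period and duration help to expose.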
Standardization of data represented within the polysomnogram, including channel labels.
By definition, the principal indices used to classify sleep-related physiological disturbances rely on the identification and quantification of events annotated from the polysomnography recordings. As described above, the variation in data collection, annotation, and scoring approaches introduces considerable heterogeneity. During its initial phase, NSRR’s computer scientists and biomedical engineers developed several signal processing tools, tailored to working with NSRR data. Tool development was informed by the needs of the local team as well as feedback from the User Community solicited during community outreach events. The NSRR team has further developed a robust signal processing pipeline for sleep data that can be applied both to existing and new NSRR datasets, as well as users’ own sleep studies uploaded to the cloud. Details of these tools will be reported in a subsequent publication, but include:
EDF Annotation Translator: this provides the framework for reading annotations stored in multiple file formats such as XML, CSV, and text files, and transforms them to a standard XML file format with Sleep Resource Ontology concepts for defining the events.
Altamira: an EDF viewer that allows the display of signals and standardized annotations.
Luna (http://zzz.bwh.harvard.edu/luna/): a C/C++ toolset and R extension library for the manipulation and analysis of large numbers of EDFs, designed with both parallelization and working with NSRR annotation data in mind; it can also be deployed as a Docker container, to facilitate migration to the cloud computing environment.
These tools support an NSRR analytical pipeline (NAP) that identifies primary signals and annotations; re-labels polysomnograms using canonical labels; and provides a standardized “NSRR” version of data that has been re-referenced and re-sampled to a common standard. These tools include a series of semi-automated checks on incoming data and output a more technically uniform set of signal and annotation files. For example, we employ steps to (1) identify and potentially fix technical issues with EDFs, (2) flag noisy, flat or duplicate signals, (3) check EEG polarities, (4) check the consistency and alignment of stage annotations with the signal data, and potentially fix misaligned staging data, (5) automatically relabel channels and annotations, potentially re-referencing, resampling or rescaling signals as needed, and dropping redundant or undocumented channels, and (6) generate a battery of statistics summarizing sleep macro- and micro-architecture, with a focus on the EEG.
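Two of the checks described above, flagging flat or duplicate signals and relabeling channels to canonical names, can be sketched in plain Python. This is not Luna's or NAP's interface: the alias table, the canonical names, and the function names are all hypothetical, and real pipelines operate on full-length signals rather than the toy lists used here.

```python
# Hypothetical alias table: lower-cased source label -> canonical label.
ALIASES = {
    "c3-m2": "EEG_C3", "eeg c3-m2": "EEG_C3",
    "sao2": "SpO2", "spo2": "SpO2",
    "flow": "Airflow", "airflow": "Airflow",
}


def flag_signals(channels):
    """Flag flat signals and exact duplicates of an earlier channel."""
    flags, seen = {}, {}
    for label, samples in channels.items():
        key = tuple(samples)
        if len(set(samples)) <= 1:
            flags[label] = "flat"
        elif key in seen:
            flags[label] = f"duplicate of {seen[key]}"
        else:
            seen[key] = label
    return flags


def relabel(channels):
    """Map source labels to canonical labels; report channels that are dropped."""
    renamed, dropped = {}, []
    for label, samples in channels.items():
        canonical = ALIASES.get(label.strip().lower())
        if canonical is None:
            dropped.append((label, "undocumented"))
        elif canonical in renamed:
            dropped.append((label, "redundant"))   # keep the first channel mapped to this name
        else:
            renamed[canonical] = samples
    return renamed, dropped


# Toy recording with made-up labels and values.
channels = {
    "EEG C3-M2": [1.0, -2.0, 3.5],
    "SaO2": [97, 96, 96],
    "SpO2": [97, 96, 96],
    "Flow": [0, 0, 0],
    "AUX7": [5, 6, 7],
}
flags = flag_signals(channels)
renamed, dropped = relabel(channels)
print(flags)
print(list(renamed), dropped)
```

In this sketch, flagging and relabeling are kept as separate steps so that a flagged channel (for example a flat airflow signal) is still relabeled and retained; whether to exclude it is left to the analyst.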
A challenge in analyzing sleep signal data relates to a lack of standards or requirements that could be used to indicate data are of sufficient quality for supporting specific, or a set of broad, applications. The NSRR team prioritizes data modifications aimed at enhancing usability—such as making physical units, sampling rates, file formats, or channel nomenclatures similar between and within studies. This approach deliberately avoids altering specific information content to achieve a particular minimum standard, recognizing that the appropriateness of such standards varies according to the specific research question and analytical methods employed. For example, standards that flag a given recording suitable for one analysis (e.g. examining spectral properties of the stable NREM EEG) may not apply to others (e.g. studying sleep onset or the relationship between sleep and circadian factors). Future work may include developing an array of diagnostic metrics and annotating these for their relative applicability for different purposes. However, ultimately, decisions related to data quality need to be made by the researchers who best understand their specific research questions.
Harmonization Steps
The process of data harmonization focuses on the specification of homogenized phenotypes that can be used to identify and characterize potentially comparable variables abstracted from different datasets, as exemplified by the work of the Trans-Omics for Precision Medicine (TOPMed) initiative [53]. While we were able to utilize resources provided by the TOPMed and BioData Catalyst projects to harmonize a range of non-sleep variables in NSRR datasets (including age, sex, race, ethnicity, smoking status, body mass index, and blood pressure), we recognized that the inherent complexity of device-based sleep data would make it difficult to develop integrated functions capable of accurately harmonizing sleep research phenotypes [54, 55]. Towards that end, we engineered a unique approach to the harmonization of polysomnography and polygraphy variables that leveraged the degree of specificity afforded by our compositional terminology. This approach progressed through the following iterative stages, as exemplified by harmonization efforts for sleep-disordered breathing variables:
Specification of target phenotypes—Candidate phenotypes were reviewed to identify commonly used terms (e.g. the AHI), as supported by their citation in published guidelines and use in the research literature.
Characterization of heterogeneity—Data generation and acquisition processes were reviewed to determine which study-level and variable-level metadata elements contributed most significantly to the heterogeneity between related but distinct target phenotypes. Potential sources of heterogeneity included sleep acquisition procedures, and for hypopnea and apnea terms, included (1) airflow reduction thresholds, (2) oxygen desaturation thresholds, and (3) the presence or absence of arousal(s).
Refinement of target phenotypes—Practical considerations prompted us to limit the degree of granularity required to specify target phenotypes. For example, although we considered basing definitions on the four level AASM classification of sleep apnea monitoring devices, we categorized the sleep device types into those that include or do not include EEG data. We limited definitions of thresholds of flow reduction to levels that could be mapped to specific AASM guidelines. By doing so we were able to identify 13 permutations of sleep-disordered breathing events by combining study types, flow reduction thresholds, and event definitions at 3% and 4% oxygen desaturation thresholds that we were able to consolidate to generate 7 AHI phenotypes and 3 REI phenotypes.
Mapping compositional tags to target phenotypes—We used our compositional terminology scheme to assign metadata tags to each phenotype to generate harmonized terms. Checks were conducted to confirm that each mapped compositional tag corresponded to a mutually exclusive AHI or REI phenotype (Supplementary Figure S2).
Identification of candidate variables—Queries were conducted using harmonized terms to identify candidate variables in each dataset based on mapped compositional tags originally assigned during the ingestion and curation of each dataset added to the NSRR.
Generation of harmonized variables—Each retrieved candidate variable was further evaluated to determine the degree to which it matched the specification of a harmonized term. When a candidate variable was deemed to be an appropriate match, a new version marked as a harmonized variable was added to the dataset with a link to the harmonized term that could be used to trigger a cross-cohort query to identify similarly harmonized variables present in other datasets.
We conduct this review on a regular basis in the course of processing and curating datasets added to the NSRR. External investigators may also follow this approach for determining if selected variables may be candidates for harmonization.
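The later stages of this workflow (mapping tags to target phenotypes, checking mutual exclusivity, and querying for candidate variables) can be sketched as a simple tag-matching lookup. The dataset names, variable names, and tag keys below are hypothetical placeholders rather than actual NSRR content, and the final manual review of each candidate is not represented.

```python
# Hypothetical target phenotypes, each defined by the tag values it requires.
TARGETS = {
    "nsrr_ahi_hp3r_aasm15": {"event": "ahi", "min_desat_pct": 3, "arousal_counts": True},
    "nsrr_ahi_hp4u_aasm15": {"event": "ahi", "min_desat_pct": 4, "arousal_counts": False},
}

# Hypothetical catalog of dataset variables with tags assigned during ingestion.
CATALOG = [
    {"dataset": "study_a", "variable": "ahi3",
     "tags": {"event": "ahi", "min_desat_pct": 3, "arousal_counts": True}},
    {"dataset": "study_a", "variable": "ahi4",
     "tags": {"event": "ahi", "min_desat_pct": 4, "arousal_counts": False}},
    {"dataset": "study_b", "variable": "rdi3p",
     "tags": {"event": "ahi", "min_desat_pct": 3, "arousal_counts": True}},
    {"dataset": "study_b", "variable": "slp_time",
     "tags": {"event": "sleep_duration"}},
]


def matches(tags, spec):
    """A variable matches a target when every required tag value agrees."""
    return all(tags.get(key) == value for key, value in spec.items())


def find_candidates(catalog, spec):
    """Return (dataset, variable) pairs whose tags satisfy the target specification."""
    return [(v["dataset"], v["variable"]) for v in catalog if matches(v["tags"], spec)]


def ambiguous_variables(catalog, targets):
    """List variables that match more than one target (should be empty)."""
    out = []
    for v in catalog:
        hits = [name for name, spec in targets.items() if matches(v["tags"], spec)]
        if len(hits) > 1:
            out.append((v["dataset"], v["variable"], hits))
    return out


candidates = {name: find_candidates(CATALOG, spec) for name, spec in TARGETS.items()}
print(candidates)
print(ambiguous_variables(CATALOG, TARGETS))
```

The `ambiguous_variables` check mirrors the requirement that each mapped tag correspond to a mutually exclusive phenotype; any non-empty result would indicate that two target definitions overlap and need refinement before harmonized variables are generated.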
Examples of Data Harmonization in NSRR
The following examples show the results of harmonization efforts to describe variation in sleep metrics across age and gender. Overnight EEG data from a total of 25 678 studies (14 618 male, 11 060 female), ages 2.5–90 years, were reprocessed using the Luna pipeline, which involved harmonizing channel labels and polarity, removing artifacts, and resampling at standard rates. Figure 5a and b show a clear reduction in N3 sleep density and an increase in the sleep fragmentation index with age, as well as evident gender differences.

(a) and (b) Sleep architecture across the lifespan and by gender in NSRR. (a) Stage N3 density (minutes of N3 sleep divided by total sleep period time, SPT); (b) sleep fragmentation index. Data from 26 673 individuals selected from the NSRR with polysomnography data, aged 2.5–90 years (57% male), from 13 cohorts (APPLES, CCSHS, CFS, CHAT, MESA, MNC, MrOS, MSP, NCHSDB, SHHS, SOF, STAGES, and WSC). Blue: male. Red: female.
In contrast, Figure 6 shows the results of efforts to harmonize the AHI. Data mapping efforts allowed unambiguous assignment of specific definitions across key variations in AHI values, demonstrating that at any age, AHI values are highest when the 1999 Chicago criteria are applied (including hypopneas with a 50% reduction in amplitude with a 3% desaturation or arousal), and lowest for the AASM 2015 definition (which requires a 30% amplitude reduction and 4% desaturation to accompany hypopneas). Unlike the approach to analysis of the EEG data where the raw data were directly reprocessed, the AHI metrics were based on events that were annotated by the data contributors. In many datasets, events were annotated with a restricted number of features (e.g. hypopneas only identified for a fixed desaturation), limiting the ability to generate alternative metrics that mapped to a single common definition, or to key definitions used over time. Future harmonization will benefit from directly reprocessing the raw respiratory signals and applying standardized automated algorithms to all datasets.
Variations of alternative mapped apnea hypopnea indices, by age, n = 18 287. Mean AHI values by age group, according to four mapped alternative definitions of the AHI, using data from 13 cohorts (ABC, APPLES, BestAIR, CFS, CHAT, HomePAP, MESA, MrOS, MSP, NuMom2b, SHHS, SOF, and WSC). All four variables were harmonized by the NSRR team:
- nsrr_ahi_chicago1999: Apnea-Hypopnea Index (all apneas + hypopneas with > 50% flow reduction or discernible flow reduction with >= 3% desaturation or arousal) per hour of sleep. The definition of hypopnea events is consistent with the American Academy of Sleep Medicine (AASM) Chicago 1999 standard.
- nsrr_ahi_hp3r_aasm15: Apnea-Hypopnea Index (all apneas + hypopneas with >= 30% nasal cannula [or alternative sensor] reduction and >= 3% oxygen desaturation or with arousal) per hour of sleep. The definition of hypopnea events is consistent with (1) the AASM 2007 Manual (2012 update) (recommended) and (2) AASM 2015 (recommended).
- nsrr_ahi_hp4r: Apnea-Hypopnea Index (all apneas + hypopneas with >= 4% oxygen desaturation or with arousal) per hour of sleep.
- nsrr_ahi_hp4u_aasm15: Apnea-Hypopnea Index (all apneas + hypopneas with >= 30% nasal cannula [or alternative sensor] reduction with >= 4% oxygen desaturation) per hour of sleep. The definition of hypopnea events is consistent with (1) the AASM 2012 update (alternative) and (2) AASM 2015 (acceptable).
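To make the four definitions in the caption concrete, the following is a minimal Python sketch of how the same scored events can yield different AHI values depending on which hypopneas are counted. It is an illustration under stated assumptions, not the NSRR's harmonization code: the `Hypopnea` record, its field names, and the `ahi` helper are hypothetical, the thresholds are taken from the caption wording only, and "discernible flow reduction" is approximated here as any reduction greater than zero.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable


@dataclass(frozen=True)
class Hypopnea:
    """Hypothetical record for one scored candidate hypopnea."""
    flow_reduction_pct: float  # percent reduction in the airflow signal
    desat_pct: float           # associated oxygen desaturation, in percent
    arousal: bool              # whether the event was linked to an arousal


# One predicate per harmonized variable, transcribed from the caption text.
# Assumption: "discernible" flow reduction is modeled as any reduction > 0.
RULES: Dict[str, Callable[[Hypopnea], bool]] = {
    "nsrr_ahi_chicago1999": lambda h: h.flow_reduction_pct > 50
    or (h.flow_reduction_pct > 0 and (h.desat_pct >= 3 or h.arousal)),
    "nsrr_ahi_hp3r_aasm15": lambda h: h.flow_reduction_pct >= 30
    and (h.desat_pct >= 3 or h.arousal),
    "nsrr_ahi_hp4r": lambda h: h.desat_pct >= 4 or h.arousal,
    "nsrr_ahi_hp4u_aasm15": lambda h: h.flow_reduction_pct >= 30
    and h.desat_pct >= 4,
}


def ahi(n_apneas: int, hypopneas: Iterable[Hypopnea],
        sleep_hours: float, rule_name: str) -> float:
    """Return (apneas + qualifying hypopneas) per hour of sleep."""
    if sleep_hours <= 0:
        raise ValueError("sleep_hours must be positive")
    rule = RULES[rule_name]
    n_hypopneas = sum(1 for h in hypopneas if rule(h))
    return (n_apneas + n_hypopneas) / sleep_hours


def ahi_all_definitions(n_apneas: int, hypopneas: Iterable[Hypopnea],
                        sleep_hours: float) -> Dict[str, float]:
    """Compute the AHI under every definition for one recording."""
    events = list(hypopneas)  # materialize so each rule sees all events
    return {name: ahi(n_apneas, events, sleep_hours, name) for name in RULES}
```

Calling `ahi_all_definitions` on a single recording returns one value per variable name. Because the stricter rules count fewer hypopneas, the resulting index varies by definition for the same underlying events, which is the pattern the figure summarizes across age groups.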
Needs and Future Directions
The impact of the NSRR’s data, tools, and outreach efforts on sleep and circadian research is evident from its support of thousands of researchers across a wide spectrum of backgrounds and from around the world, its contributions to hundreds of manuscripts, and its role in the development of numerous novel sleep scoring algorithms and the discovery of novel sleep predictors. It has also supported early as well as later stage investigators who have accessed NSRR data to generate preliminary data for grant applications or have used NSRR data as primary data for academic purposes. There are, however, several areas that can be enhanced:
Education: Informatics and data sharing policies are each rapidly evolving. Ongoing education and support for a range of stakeholders are needed to ensure understanding of the value of, and approaches for, collecting, archiving, labeling, sharing, and analyzing the rich sleep and circadian data increasingly generated by sleep laboratories and research studies.
Data coverage: While the NSRR continues to ingest data and is expanding to include data from animal and circadian study designs, there are data gaps. Notably, the data within NSRR are limited to those elements that contributors are willing to share. Some data are not shared due to proprietary concerns by contributors; in other cases, data other than the sleep data have been made available to other repositories (e.g. BioLINCC) and are only accessible by securing permissions for those repositories to cross-link with the data within the NSRR. There are continued challenges in harmonizing sleep-related data that are collected without strong standardization. The harmonization work can be labor intensive (e.g. relying on vetting of potentially harmonizable variables to identify common terms or using compositional terminology components to generate accurate and matchable labels). Implementation of NIH’s Data Management and Sharing Policy, which requires the scope and format of data sharing to be described as part of the grant submission process, should both facilitate and incentivize the sharing of larger amounts of data. The NSRR is in the process of depositing data into NHLBI’s BioData Catalyst repository, which will make cross-linking with other data, including genetic data, in overlapping cohorts easier. Ultimately, the constraints on big data opportunities for the sleep and circadian field relate to the availability of sleep and circadian data linked to the broad range of social determinants of health, clinical, outcome, and molecular data needed to drive transformative science. NIH and other sponsors need to periodically assess the availability of such data and invest in prospective data collection to fill critical gaps.
Lack of standards/burden of data harmonization and mapping procedures: This paper discussed multiple sources of heterogeneity inherent in current clinical and research data collection protocols. While the NSRR developed innovative approaches to reducing or addressing this heterogeneity, the sleep and circadian fields need to push for more rigorous up-front standardization of data collection and archival procedures, event and disorder definitions, and metadata, which in aggregate will simplify data sharing efforts and improve the quality of harmonized datasets. Adoption and dissemination of core sleep and circadian Common Data Elements will require collaboration of domain experts, informaticians, and clinicians in the development of standards, with ongoing work to ensure that such standards are updated and used appropriately. Adoption of standards needs to expand beyond event definitions and disease definitions to include standardization across multiple levels of data collection, including nomenclature, pre-processing steps, numerical standards, and data and metadata formats. Professional societies may recommend that research data utilize a set of standards that includes machine-readable formats. Societies representing clinical sleep medicine can make sleep laboratory accreditation dependent on use of data standards to allow data to be readily queried as well as shared (after appropriate de-identification).
Open-source tools: Reproducible research rests on a pillar of shared, documented, and robust computational tools. There needs to be clarity regarding how to balance issues related to intellectual property, commercialization, and scientific rigor. Requiring investigators to share code (or predictive models) will improve both the rigor and transparency of research. However, there is wide variability in how code is documented and updated, and often there is no ongoing support to ensure that developers can respond to user questions or identified bugs.
Emergence of “profit-based” or restricted data repositories: Finally, much of the rapid rise in web-based commercial entities (Google, Apple, etc.) is based on the commercial value of aggregating and leveraging individual-level data. Sleep data have attracted commercial interest due to the potential for those data to inform (1) the targeted development of products aimed at a $40 billion “sleep-health/wellness” market; (2) development of commercial algorithms for improved quantification of sleep-related parameters and sleep-related devices; and (3) development of sleep-focused interventions. In addition to the ethical and privacy concerns related to the commercial use of individual-level data, these commercial interests may drive the development of restricted sleep data repositories that may compete with more generally accessible repositories and limit the community’s ability to engage in open discovery and competition. Non-commercial but restricted access to aggregated data also occurs when academic groups prevent data sharing to protect their own intellectual property, which similarly constrains the potential of “big” sleep data analyses as an open, community knowledge source.
Summary
While there are many challenges for sleep and circadian data sharing and harmonization, work by the NSRR suggests the utility of several novel approaches and demonstrates that heterogeneous and valuable data can be readily shared to support a wide range of research and algorithmic development. Moreover, much of the work related to data harmonization can inform data sharing, metadata, and Common Data Element development in other domains. With further data sharing and standardization, the field will move closer to its vision of utilizing large datasets and powerful tools, including machine learning, to enhance scientific discovery and productivity, statistical power, rigor, and reproducibility, and to ensure that the discoveries for sleep and circadian science are applicable to diverse populations. While community-oriented efforts such as those pioneered by the NSRR progress, there also will be a need to carefully consider the roles of restricted commercial and non-commercial efforts in complementing or competing with “open” data sharing efforts, including NIH’s roles in supporting these efforts, and the types of permissions and safeguards needed to ensure ethical, privacy, and intellectual property needs are appropriately addressed.
Acknowledgments
This work was supported by National Heart Lung and Blood Institute (NHLBI) grant R24 HL114473 and contract 75N92019C00011 to SR.
Funding
Financial Disclosure: SR reports consulting fees from Jazz Pharmaceuticals, Eli Lilly, and ApniMed Inc unrelated to this manuscript, and grants from NIH that supported this work.
Non-financial Disclosure: none.
Data Availability
The data underlying this article are available in the NSRR at sleepdata.org. The individual-level data are available through application to the NSRR.
Author notes
Ying Zhang and Matthew Kim contributed to this paper as co-first authors.