Christophe Gauld, Kévin Ouazzani, Jean-Arthur Micoulaud-Franchi, To split or to lump? Classifying the central disorders of hypersomnolence: sleep split requires epistemological tools and systematic data-driven conceptual analysis, Sleep, Volume 43, Issue 9, September 2020, zsaa091, https://doi.org/10.1093/sleep/zsaa091
The article entitled “To split or to lump? Classifying the central disorders of hypersomnolence” by Fronczek et al., published in the journal Sleep in March 2020 [1], proposes a new classification for Central Disorders of Hypersomnolence (CDH) purported to be based on current scientific evidence and to go a step beyond the International Classification of Sleep Disorders (ICSD-3). The authors suggest that both “narcolepsy with cataplexy” and “idiopathic hypersomnia with long sleep time” could be considered as separate entities (splitting process), and that “narcolepsy without cataplexy” and “idiopathic hypersomnia without long sleep time” could be combined into a single disorder (lumping process). This is an important contribution on a subject much debated in the scientific community, owing to the absence of a robust classification of hypersomnolence. However, the construction of a classification requires solid theoretical, epistemological, and methodological tools, which have been developed in the field of medical classification for more than 30 years and should not be overlooked [2]. We would like to make a conceptual contribution to this debate in order to avoid some of the pitfalls involved in classifying pathologies. We propose an analysis of the mooted classification that, in our opinion, should be based on both a solid theoretical/epistemological framework and a rigorous data-driven methodology.
Theoretical and Epistemological Framework for Sleep Disorder Classification
Fronczek et al. propose splitting or lumping the diagnostic entities, a process well known in epistemology and theoretical science for a century and a half and first suggested by Darwin in a letter to Hooker in 1857 (“It is good to have hair-splitters and lumpers”) [3]. Splitting and lumping are basically strategies for producing scientific taxonomies. These concepts have been used widely in the field of biology and have been proposed in medicine to answer the question: what is the methodology for grouping (i.e. lumping) or separating (i.e. splitting) classes of medical conditions?
Splitting and lumping are based on the premise that the world is organized in structures. However, entities are similar to or distinct from one another in many ways. Classically, epistemology serves to enlighten the process of classification and is underpinned by the concepts of “natural kind” and “practical kind.” “Natural kind” supposes regularities in phenomena that exist irrespective of our wishes or preferences. Natural kinds can be defined by identifying biological anomalies. Fronczek et al. split off narcolepsy with cataplexy (type 1) because this kind of pathology can be defined by the biological anomalies that cause it, such as a lack of orexin. Thus, the split of narcolepsy with cataplexy is conceptually justified because this type of narcolepsy can be considered a “natural kind” with a precise mechanistic structure. However, current scientific evidence does not substantiate a comparable mechanistic, bottleneck-type biological marker for the other CDH, so classifying them in this way remains a challenge. The epistemology of classification provides two theoretical frameworks to meet such a challenge.
First, according to the natural kind theory, other CDH could be thought of in terms of the mechanistic property cluster (MPC) [4, 5]. Such a multi-level cluster (environment, symptom, behavior, physiology, genetic) constitutes a natural kind when the co-occurrence of the properties in the cluster can be explained by a similar mechanism that generates them, which can then be generalized. It is only after performing this conceptual “dissection,” which describes the structure of different CDH, that it is possible to determine whether we can split or lump categories. As Craver puts it: “If you find that a single cluster of properties is explained by more than one mechanism, split the cluster into subset clusters”; “If you find that two or more putatively distinct kinds are explained by the same mechanism, lump the putative kinds into one” [6].
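Craver's two rules can be read as a simple decision procedure. The sketch below is only an illustration under strong simplifying assumptions: each putative kind is reduced to a set of mechanism labels, and the function name `split_or_lump` as well as the labels `kind_A` and `mechanism_1` are hypothetical placeholders that make no clinical claim about any CDH category.

```python
from collections import defaultdict


def split_or_lump(kinds):
    """Apply Craver's two rules to a toy description of putative kinds.

    kinds: dict mapping a kind name to the set of mechanisms assumed
    to explain its property cluster (all labels are hypothetical).
    Returns a dict mapping each kind name to "split", "lump", or "keep".
    """
    decisions = {}

    # Rule 1: a cluster explained by more than one mechanism
    # is a candidate for splitting into subset clusters.
    for name, mechanisms in kinds.items():
        if len(mechanisms) > 1:
            decisions[name] = "split"

    # Rule 2: distinct kinds explained by the same single mechanism
    # are candidates for lumping into one kind.
    by_mechanism = defaultdict(list)
    for name, mechanisms in kinds.items():
        if len(mechanisms) == 1:
            by_mechanism[next(iter(mechanisms))].append(name)
    for names in by_mechanism.values():
        if len(names) > 1:
            for name in names:
                decisions[name] = "lump"

    # Anything not flagged by either rule is left as it is.
    for name in kinds:
        decisions.setdefault(name, "keep")
    return decisions


# Hypothetical placeholder input, not clinical data.
toy_kinds = {
    "kind_A": {"mechanism_1"},
    "kind_B": {"mechanism_1"},
    "kind_C": {"mechanism_2", "mechanism_3"},
    "kind_D": {"mechanism_4"},
}
toy_decisions = split_or_lump(toy_kinds)
```

In this toy input, `kind_A` and `kind_B` share one mechanism and are flagged for lumping, while `kind_C` is flagged for splitting. The hard part in practice is the step the code takes for granted: establishing which mechanisms actually explain each cluster.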
Second, some medical conditions can also be classified by using the concept of “practical kinds.” This refers to the variety of decisions we make in order to classify an indeterminate world, and especially to kinds that prove useful in practice [7]. In our opinion, the decision (1) to lump “idiopathic hypersomnia without long sleep time” together with “narcolepsy without cataplexy” and (2) to split “idiopathic hypersomnia with long sleep time,” as proposed by the authors, is in fact based principally on practical kinds, notably because sleep medicine laboratories evaluate sleep with tools such as the Multiple Sleep Latency Test and ad lib PSG protocols. The emphasis that the authors place on the various procedures for performing ad lib PSG protocols is completely in keeping with this practical consideration.
Data-Driven Methodology for Modifying Sleep Disorder Classification
Prioritizing natural or practical kinds over the reality of CDH has profound implications. Indeed, repositioning an entity within a classification is not only a terminological adaptation: it also requires the reassessment of concepts, interests, therapeutic options, reimbursement arrangements, research funding, and more generally, sociological views about it. Separating or combining categories from a classification without establishing the kinds of sleep disorders may lack robustness. Thus, a scientific evidence-based data-driven classification is of value only if an exhaustive literature review is conducted to support the conceptual analysis of the kinds and to identify the main clinical and scientific domains of analysis and explanation of the different CDH categories. Yet, with the exponential increase in the number of articles published every year in the biomedical domain, there is a need to build automated systems to extract unknown information from the articles published [8]. Indeed, manual procedures for text classification work well for up to a few hundred documents. When the number of documents is larger, however, manual procedures become laborious, time-consuming, and potentially unreliable [9].
As Haendel and Chesler wrote in 2012 [10], “Bioinformatics is a growing discipline that has emerged from the practical needs of modern medicine.” It allows us to organize scientific knowledge systematically. This diverse and interdisciplinary field spans the fundamentals of database development and data integration, taking classification constraints into account. For classification in medicine, this provides a fundamental shift in our ability to associate and dissociate concepts and relate nosological entities [10], a shift that sleep medicine research would gain from adopting. Among these bioinformatics methodologies, text mining techniques enable the extraction of unknown knowledge from the exponentially growing number of articles published [9]. The purpose of this letter is to encourage researchers in sleep medicine to explore this kind of technique, which is bound to be developed in the coming years and used for future classifications.
We have built example networks for “Idiopathic Hypersomnia” [Mesh] (91 articles) and “Narcolepsy” [Mesh] (5560 articles) by text-mining analyses and topic modeling (natural language processing; 150 nodes per network). All the data, the code, and the figures of our analyses are openly available at https://osf.io/d37fe/. To our knowledge, this is the first data-driven conceptual analysis of CDH. These techniques facilitate the automatic assignment of text strings to categories, “making classification expedient, fast, and reliable, which creates potential for its application in organizational research and nosological methodology” [8]. Such an analysis allows us to go beyond the implicit conceptions and theories of clinicians and researchers and to confront them with an agnostic, data-driven exploration of knowledge. It is a useful example of how automated systems can extract the principal explanations and practical domains from published articles, and it paves the way for splitting or lumping categories. It offers the means to extract and explain mechanistic properties and practical constraints for constructing a more robust classification and can help in choosing the most appropriate terminological adaptation of a given category.
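To give a rough idea of what building a term network from article text involves, here is a minimal sketch. It is not the analysis deposited on OSF: it swaps topic modeling for a much simpler document-level term co-occurrence count, uses only the Python standard library, and runs on three made-up placeholder sentences rather than real abstracts. The function names, the stopword list, and the reading of the 150 figure as a cap on the number of nodes are all assumptions.

```python
import re
from collections import Counter
from itertools import combinations

# Tiny illustrative stopword list (an assumption; real pipelines use larger ones).
STOPWORDS = {"the", "of", "and", "in", "with", "a", "is", "to", "for"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def cooccurrence_network(documents, max_nodes=150):
    """Build a term co-occurrence network.

    Nodes are the max_nodes terms appearing in the most documents.
    An edge weight counts the documents in which both terms appear.
    """
    doc_terms = [set(tokenize(d)) for d in documents]
    doc_freq = Counter(term for terms in doc_terms for term in terms)
    nodes = {term for term, _ in doc_freq.most_common(max_nodes)}

    edges = Counter()
    for terms in doc_terms:
        kept = sorted(terms & nodes)  # sorted so each pair has one canonical order
        for a, b in combinations(kept, 2):
            edges[(a, b)] += 1
    return nodes, edges


# Made-up placeholder texts, not real abstracts.
toy_docs = [
    "narcolepsy with cataplexy and orexin deficiency",
    "orexin deficiency in narcolepsy",
    "idiopathic hypersomnia with long sleep time",
]
toy_nodes, toy_edges = cooccurrence_network(toy_docs, max_nodes=150)
```

In this toy corpus the terms from the first two texts form one connected group and those from the third form another, with no edge between them. On a real corpus, the question of interest is whether such groups stay separate (an argument for splitting) or merge (an argument for lumping).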
In conclusion, we would like to emphasize that the process of “split-lump-driven” nosographic categorization should draw on both rigorous theoretical tools and evidence-based, data-driven methodological approaches in the quest to develop a robust classification of CDH.
Funding
None declared.
Conflict of interest statement. None declared.