Abstract

Background

The latest review of studies on multimorbidity patterns showed high heterogeneity in the methodology for identifying groups of multimorbid conditions. However, it is unclear how the analytical methods used influence the multimorbidity patterns identified.

Methods

We searched PubMed and EMBASE from their inception to January 2017 for a systematic review of the analytical methods used to identify multimorbidity patterns. We then conducted a comparison analysis, using the Australian National Health Survey (NHS) 2007–08 data, to assess the effect of the analytical methods on the multimorbidity patterns identified.

Results

We identified 13 194 studies and excluded 13 091 based on titles/abstracts. From the full-text reviews of the 103 remaining publications, we identified 41 studies that used five different analytical methods to identify multimorbid conditions. Thirty-seven studies (90%) adopted either factor analysis or hierarchical clustering, but heterogeneity arose within each method from the use of different proximity measures to form clusters. Our comparison analysis showed that the identified groups of multimorbid conditions varied when the methods were applied to the same NHS data. The main similarities among the groupings obtained by the five methods fell into three disease groups: (i) cardiovascular and metabolic diseases, (ii) mental health problems and (iii) allergic diseases.

Conclusion

We showed the extent to which heterogeneous analytical methods affect the identification of multimorbidity patterns. However, more work is needed to guide investigators in choosing the best analytical method to improve the validity and generalizability of findings. Investigators should also attempt to compare results obtained by various methods to reach a consensus grouping of multimorbid conditions.

Key Messages
  • Identification of multimorbid conditions beyond chance is important in multimorbidity research, as the findings not only generate new hypotheses on possible shared biologic processes among specific diseases, but also facilitate quantifying the effect of multimorbidity on health-related outcomes.

  • Previous reviews showed substantial variation in methodologies in studies for identifying the nature and patterns of non-random multimorbidity. However, questions remain with regard to how the analytical methods used in these studies address non-random association between conditions and the extent of their effects on the grouping of multimorbid conditions.

  • We reviewed the analytical methods in epidemiological studies on multimorbidity and identified five different methods to reveal multimorbid conditions in these studies. Within each method, heterogeneity of findings arises due to the use of different proximity measures to form clusters.

  • In the comparison analysis, we used the Australian National Health Survey data to illustrate the extent to which heterogeneous analytical methods affect the identification of multimorbidity patterns. The main similarities among the clusters obtained fall into three disease groups.

  • These findings will contribute to developing a more uniform methodological framework to guide investigators in choosing the analytical method to enhance the validity and generalizability of findings in future multimorbidity research.

Introduction

Multimorbidity (the co-occurrence of two or more health conditions within one person1) poses a serious burden and challenge to the healthcare system in many countries, as it is closely associated with poorer health outcomes, more complex clinical management, higher use of health services and higher associated costs.2–6 Based on the 2004–05 Australian National Health Survey (NHS), 80% of Australians aged ≥65 years have three or more chronic conditions. A recent study estimated that more than 210 000 Australians aged ≥55 years spend more than 20% of their income on health and healthcare.7 High prevalence rates of multimorbidity are also observed in other countries such as Canada and the USA.8,9 Previous studies10,11 suggested that social disadvantage enhanced the likelihood of illness through increased exposure to environments that have a detrimental effect on health, such as poor diet and care, adverse family conditions and exposure to smoking, drinking and psychosocial stress.

Although multimorbidity is increasingly recognized as an important issue in medical care, most clinical studies have used a single disease-based paradigm, with clinical trials typically excluding participants with multimorbidity.12–14 In addition, only a minority of clinical guidelines in Australia and the USA make specific recommendations about the management of individuals with multiple conditions.2 The vast majority of services in tertiary care settings also focus largely on single conditions, despite high rates of multimorbidity among the populations attending such services. Multimorbidity research has the potential to play an important role in addressing these limitations of existing services and in shifting the focus of health services away from single conditions towards the development of more effective strategies to better help the large numbers of individuals with multimorbid conditions.15,16

There are two major approaches in epidemiological studies on multimorbidity. The first approach investigates multimorbid conditions with a specific index condition, such as attention deficit hyperactivity disorder (ADHD) in childhood.17 From this perspective, the term usually used is comorbidity.1 The comorbidity study approach allows the study of specific health problems in selected subpopulations of interest. The second approach emphasizes the quantification of ‘non-random’ multimorbidity and differences in multimorbidity patterns among individuals in the general population. Identification of multimorbid conditions beyond chance is important in multimorbidity research, as the findings not only generate new hypotheses on possible shared biologic processes or pathophysiological pathways among specific diseases,12,18 but also facilitate quantifying the impact of multimorbidity on health-related outcomes and quality of life.19–21 The latter is essential to provide the evidence base for designing future studies to improve prevention, treatment and care of patients with multiple conditions. A recent Cochrane review of the effectiveness of interventions to improve outcomes for people with multimorbidity22 showed that the variety of interventions conducted in this area to date had yielded little improvement in outcomes, especially in health-service use and medication adherence. Uncertainties remain about how to characterize multimorbidity and assess its impact on health outcomes. Whilst a number of large-scale national surveys have been conducted worldwide to study important health problems associated with multimorbidity, the synergistic and interactive effects with other key risk factors on multimorbidity patterns, as well as the effect on health outcomes, remain unclear.23

To assess and quantify the impact of multimorbidity on individuals’ health-related outcomes and service use, there is a need to obtain an appropriate measure of multimorbidity to characterize individuals’ multimorbidity patterns.24 Most existing measures use a single score to quantify the multimorbidity of an individual, such as the number of conditions or weighted scores including the Charlson Comorbidity Index and the Cumulative Illness Rating Scale.24–27 These numerical indices, however, do not account for multimorbidity by chance28 and may suffer from generalizability problems, as they either depend on the number of conditions under study or were originally developed and validated for an index condition or for specific outcomes.25,29 Previous reviews and studies demonstrated that certain combinations of multimorbid conditions have greater synergistic effects on specific health outcomes and service use, compared with multimorbidity involving other conditions,30–32 implying that the number of diagnosed conditions, widely used to adjust for individual multimorbidity in outcome analyses, is not an appropriate measure.33

The latest systematic review of epidemiological studies on multimorbidity patterns published before June 2012 showed the diversity of the nature and patterns of multimorbidity found across different studies.12 This finding may be explained by the different methodological approaches used to identify multimorbid condition groups and to select, define and code diseases, as well as by the different population samples used in those analyses. As different analytical methods adjust for multimorbidity by chance to different extents, it is anticipated that the reported prevalence of multimorbidity, and hence the multimorbid groups of conditions from different studies, will also vary.34

The objective of this systematic review was to identify the analytical methods used in studies for the identification of multimorbid condition groups and to understand how each method addresses the potential for non-random association. Moreover, we conducted a comparison analysis of the identified analytical methods using the Australian NHS 2007–08 data. Because other key factors, such as the selected conditions and the population sample, were the same for all methods, differences in the resulting multimorbidity patterns could be attributed entirely to the methods. This review will improve our knowledge about the nature and diversity of the analytical methods used in various studies. The comparison analysis will show the extent to which heterogeneous analytical methods affect the identified multimorbidity patterns. These findings will contribute to developing a methodological framework to guide investigators in choosing the best analytical method to enhance the validity and generalizability of findings, reporting multimorbidity patterns and advancing the evidence base around the non-random associative multimorbidity of health conditions. The ultimate goal is to address increasingly complex health needs due to multimorbidity by developing health management and guidelines for the effective and efficient prevention, treatment and care of patients with multiple conditions, as well as improved navigation of the health services for the patients.

Methods

Systematic search strategy

We conducted a systematic search in the PubMed and EMBASE electronic databases from their inception to January 2017 to identify relevant studies. In PubMed, we searched via Medical Subject Headings (MeSH) by combining MeSH terms relevant to our topic with Boolean operators. We specified multimorbidity and cluster analysis as MeSH headings and combined search terms synonymous with multimorbidity as well as keywords, such as coexisting health conditions, multiple diagnoses and concurrent diagnoses. We applied the search strategy in EMBASE using the index terms listed above via the EMBASE EMTREE thesaurus. Supplementary Tables 1 and 2, available as Supplementary data at IJE online, show the detailed search strategies in PubMed and EMBASE.

Inclusion criteria

To be eligible for the systematic review, studies had to be original research, written in English and indexed in PubMed and EMBASE, in which the basis of the research was the identification of patterns of multimorbid groups of health conditions, with explicit emphasis on the method(s) used for exploring multimorbidity patterns.

Exclusion criteria

Studies were excluded if they:

  1. applied descriptive measures of multimorbidity, e.g. based on prevalence or count of health conditions;

  2. studied association between health conditions by first selecting index conditions;

  3. identified multimorbid clusters based on fewer than 10 health conditions;

  4. focused on grouping patients (or participants) only, rather than conditions.

Data extraction and quality assessment

We extracted the name of the first author, publication year, title, journal and abstract for each paper during the screening, as well as the sample size, number of health conditions, statistical method, clustering algorithm, proximity measure and method to determine the number of groups for each included study. The quality of reporting of the included studies was assessed using a shorter version of the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) checklist,26 in which 23 items relevant to reports of cross-sectional studies were included. An overall quality score for an included study was determined by the total number of STROBE items the study addressed (ranging from 0 to 23). The selection of included studies, full-text reviews, data extraction and quality assessment were conducted independently by two reviewers (S.N. and R.T.).

Comparison analysis

We compared the analytical methods for identifying multimorbid groups of health conditions using the Australian NHS 2007–08 data. The survey collected information about the prevalence of current long-term conditions from 20 788 Australians, where current long-term conditions were defined as medical conditions that were current at the time of the survey and that had lasted at least 6 months, or that the respondent expected to last for 6 months or more.35 Medical conditions were classified based on the World Health Organization International Classification of Diseases, Tenth Revision.36 The Australian Bureau of Statistics (ABS) website provides a summary of findings37 and the NHS data in Confidentialized Unit Record Files (CURFs), which may be accessed via the Remote Access Data Laboratory.38

Results

Literature search results

Using the above systematic search strategy, we identified a total of 13 194 unique papers after removing duplicate records. Among these papers, we excluded 13 091 as irrelevant after reviewing their titles and abstracts with reference to our inclusion and exclusion criteria. We conducted full-text reviews of the 103 remaining papers. Of these, we excluded a further 62 papers because they adopted descriptive measures of multimorbidity, aimed to cluster patients and/or considered association between health conditions with a pre-selected index condition. Table 1 displays the methodological aspects of the 41 papers finally included, in terms of the analytical method, the proximity measure and the type of clustering algorithm used for the identification of multimorbid condition groups.
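The screening counts reported above can be checked with simple arithmetic. The short Python sketch below recomputes the selection flow; the variable names are illustrative only, and the figure of 37 studies using factor analysis or hierarchical clustering is taken from the Abstract.

```python
# Counts as reported in the text; variable names are illustrative.
identified = 13_194             # unique papers after removing duplicates
excluded_on_screening = 13_091  # excluded on title/abstract review
excluded_on_full_text = 62      # excluded after full-text review

full_text_reviewed = identified - excluded_on_screening
included = full_text_reviewed - excluded_on_full_text

factor_or_hierarchical = 37     # studies using either method (from the Abstract)
share = 100 * factor_or_hierarchical / included

print(full_text_reviewed, included, round(share))
```

The result agrees with the 103 full-text reviews, the 41 included studies and the 90% share quoted in the text.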

Table 1.

Analytical methodology for the identification of multimorbid condition groups (41 studies)

| Study, year | No. of individuals; no. of conditions | Statistical method | Clustering algorithm | Proximity measure | Determination of number of groups |
| --- | --- | --- | --- | --- | --- |
| Abad-Díez et al.,58 2014 | 72 815; 51 | Exploratory factor analysis | Principal-factor method; Oblique (Oblimin) rotation | Tetrachoric correlation matrix | Scree plot; factor loadings ≥0.25 |
| Alonso-Moran et al.,59 2015 | 126 889 patients with T2DM; 28 | Agglomerative hierarchical cluster analysis | Ward’s minimum-variance method | Pearson’s dissimilarity | Cut-off point of 0.98 imposed on dendrogram |
| Clerencia-Sierra et al.,60 2015 | 924 patients aged ≥65 years; 59 | Exploratory factor analysis (stratified by sex) | Principal-factor method; rotation method NS | Tetrachoric correlation matrix | Scree plot; factor loadings ≥0.25 |
| Cornell et al.,61 2007 | 1 645 314; 45 | Agglomerative hierarchical cluster analysis | Lance-Williams method with β set at –0.50 | Jaccard coefficient | Evaluated by seven health services researchers relative to three clinical criteria |
| Dong et al.,62 2013 | 496; 16 | Agglomerative hierarchical cluster analysis (stratified by sex) | Average linkage | Yule Q | Cut-off between 0.2 and 0.3 imposed on dendrogram |
| Dorenkamp et al.,63 2016 | 3386; 15 | Agglomerative hierarchical cluster analysis | Ward’s minimum-variance method | Squared Euclidean distance | Agglomerative coefficient; dendrogram; Pseudo F statistic |
| Foguet-Boreu et al.,64 2015 | 322 328 aged ≥65 years; 99 | Agglomerative hierarchical cluster analysis (stratified by age group and sex) | Ward’s minimum-variance method | Jaccard coefficient | Semi-partial R²; Calinski-Harabasz Pseudo F; Pseudo T² statistic |
| Formiga et al.,65 2013 | 328 aged 85 years; 16 | Agglomerative hierarchical cluster analysis | Average agglomeration | Yule Q | NS |
| Gabilondo et al.,66 2017 | 2 255 406; 27 | Agglomerative hierarchical cluster analysis (stratified by sex) | Ward’s minimum-variance method | Pearson’s dissimilarity | Cut-off based on clinical importance |
| García-Olmos et al.,67 2012 | 198 670; 26 | Multiple correspondence analysis (account for age and sex) | Graphic of category points in selected dimensions to indicate ‘importance’ | Data matrix | Benzécri inertia adjustment; inertia >0.05 |
| Garin et al.,68 2014 | 3625 aged >50 years; 11 | Exploratory factor analysis | Extraction method NS; Oblique (Oblimin) rotation | Tetrachoric correlation matrix | Eigenvalue ≥1; factor loadings >0.25 |
| Garin et al.,69 2016 | 41 909 aged >50 years; 12 | Exploratory factor analysis | Extraction method NS; Oblique rotation | Tetrachoric correlation matrix | Parallel analysis by simulation; Scree test |
| Goldstein et al.,70 2010 | 3591 homeless veterans; 16 | Factor analysis | Principal-component method; Varimax rotation | Pearson-correlation matrix | Eigenvalue ≥1 |
| Gu et al.,71 2017 | 2452 aged ≥60 years; 13 | Factor analysis | Principal-component method; Oblique (Oblimin) rotation | Tetrachoric correlation matrix | Eigenvalue >1; factor loadings ≥0.25 |
| Held et al.,54 2016 | 1464 aged ≥70 years; 17 | Network and cluster analysis | Modularity class analysis | ForceAtlas2 layout (geometric distance and degree of nodes) | Resolution = 0.6 |
| Hermans and Evenhuis,72 2014 | 1047 with at least two conditions and aged ≥50 years with ID; 14 | Exploratory factor analysis | Maximum-likelihood method; Varimax rotation | Tetrachoric correlation matrix | CFI; RMSEA; SRMR |
| Herr et al.,73 2015 | 117 with diagnosed heart failure; 10 | Factor analysis | Principal-component method; Oblimin rotation | Pearson-correlation matrix | Scree plot |
| Holden et al.,28 2011 | 78 430; 23 | Exploratory factor analysis | Extraction method NS; orthogonal quartimax/quartimin rotation | Tetrachoric correlation matrix | Scree plot; Eigenvalue >1.0; SRMR < 0.05; CFI & TLI > 0.95; factor loadings (magnitude) >0.4 |
| Islam et al.,74 2014 | 4574; 10 | Agglomerative hierarchical cluster analysis; principal-component analysis | Average linkage; Varimax rotation | Yule Q; tetrachoric correlation matrix | Dendrogram; Agglomerative coefficient; Scree plot; Eigenvalue >1; SRMSR; CFI; TLI; factor loadings (magnitude) >0.3 |
| Jackson et al.,75 2015 | 7270 women aged >75 years; 31 | Exploratory factor analysis | Principal-factor method; Varimax rotation | Tetrachoric correlation matrix | Eigenvalue >1; Scree plot |
| Jackson et al.,76 2016 | 4896 women; 31 | Exploratory factor analysis | Principal-factor method; Varimax rotation | Tetrachoric correlation matrix | Eigenvalue >1; Scree plot |
| John et al.,25 2003 | 1039 aged ≥60 years; 11 | Agglomerative hierarchical cluster analysis | Complete linkage; average linkage | Phi coefficient; Yule Q | Sensitivity analysis; Goodman and Kruskal’s gamma |
| Jovic et al.,77 2016 | 13 103 aged ≥20 years; 13 | Exploratory factor analysis (stratified by age group and sex) | Principal-component method; Varimax orthogonal rotation | NS | Scree plot; Factor loadings >0.25; Eigenvalues >1 |
| Kim et al.,78 2012 | 1844 HIV-infected patients; 15 | Exploratory factor analysis | Extraction method NS; oblique rotation | Tetrachoric correlation matrix | Eigenvalues; CFI; TLI; RMSEA |
| Kirchberger et al.,79 2012 | 4127 aged ≥65 years; 13 | Exploratory factor analysis | Principal-factor method; oblique (Oblimin) rotation | Tetrachoric correlation matrix | Eigenvalue ≥1; factor loadings ≥0.25 |
| Kumar et al.,80 2017 | 2134 aged ≥50 years; 45 | Hierarchical cluster analysis using treelet transformation | Principal-component analysis | Correlation | Dendrogram; cut level of 12 determined by a 10-fold cross-validation procedure |
| Magnan et al.,81 2018 | 29 562 with types 1 or 2 diabetes; 12 | Exploratory factor analysis | Robust weighted least squares; oblique (Geomin) rotation | Tetrachoric correlation matrix | Eigenvalue >1; interpretability of factors |
| Marengoni et al.,20 2009 | 1099; 15 | Agglomerative hierarchical cluster analysis | Average linkage | Yule Q | NS |
| Marengoni et al.,82 2010 | 1332 aged ≥65 years; 19 | Agglomerative hierarchical cluster analysis | Average linkage | Yule Q | NS |
| Marengoni et al.,83 2013 | n1 = 1155, n2 = 1173; 19 | Agglomerative hierarchical cluster analysis | Average linkage | Yule Q | NS |
| Ng et al.,34 2012 | 8841; 24 | Three-step cluster analysis | Searching all comorbid pairs to form groups | Somers’ D statistic | Association test using the Benjamini–Hochberg procedure |
| Nurnberg et al.,84 1991 | 110; 12 | Factor analysis | Principal-factor method; orthogonal Varimax rotation | Correlation matrix | Eigenvalue >1; factor loadings ≥0.4 |
| Poblador-Plou et al.,85 2014 | n1 = 79 291, n2 = 275 682; 114 | Exploratory factor analysis | Principal-factor method; oblique (Oblimin) rotation | Tetrachoric correlation matrix | Scree plot; factor loadings ≥0.25 |
| Prados-Torres et al.,86 2012 | 275 682; 13–39 for various age group and sex | Exploratory factor analysis (stratified by age group and sex) | Principal-factor method; oblique (Oblimin) rotation | Tetrachoric correlation matrix | Scree plot; factor loadings ≥0.25 |
| Prazeres et al.,87 2015 | 1993; 16 | Agglomerative hierarchical cluster analysis | Average linkage | Yule Q | NS |
| Ruiz et al.,88 2015 | 2 788 900 aged ≥65 years; 20 | Correspondence analysis | Graphic of category points in selected dimensions to quantify information explained | Data matrix | Benzécri inertia adjustment; Greenacre approximation |
| Schäfer et al.,89 2010 | 149 280 aged ≥65 years; 46 | Exploratory factor analysis (stratified by sex) | Principal-factor method; oblique (Oblimin) rotation | Tetrachoric correlation matrix | Eigenvalue ≥1; factor loadings ≥0.25 |
| Sideris et al.,90 2016 | No. of individuals and conditions NS | Agglomerative hierarchical cluster analysis | Ward’s minimum-variance method | Jaccard coefficient | Dendrogram; cut level of 1.5 determined based on a validation procedure |
| Vu et al.,91 2011 | 3729 with at least two conditions and aged ≥65 years (fall-related injuries); 14 | Agglomerative hierarchical cluster analysis | Average linkage | Jaccard coefficient; Yule Q | Calinski/Harabasz index; Duda/Hart index |
| Walker et al.,92 2016 | 5647 aged ≥55 years; 19 | Exploratory factor analysis | Principal-factor method; oblique (Oblimin) rotation | Tetrachoric correlation matrix | Scree plot; Eigenvalue ≥1; factor loadings >0.25 |
| Wang et al.,93 2015 | 1480 aged ≥60 years; 16 | Exploratory factor analysis | Extraction method NS; oblique (Oblimin) rotation | Tetrachoric correlation matrix | Eigenvalue >1; factor loadings ≥0.25 |

NS, not specified; CFI, comparative fit index; TLI, Tucker-Lewis index; RMSEA, root mean square error of approximation; SRMR, standardized root mean square residual; ID, intellectual disabilities.

Under RMSR and SRMR, the extracted factor construct is satisfactory if the statistic is less than 0.10 and 0.05, respectively.94,95 When using TLI and CFI, acceptable factor constructs are obtained if both statistics are greater than 0.95.95
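To make the role of the proximity measure in Table 1 concrete, the following minimal Python sketch computes three of the listed measures (Jaccard coefficient, Yule Q and phi coefficient) from the 2×2 table of a pair of conditions, and then groups conditions by a naive agglomerative clustering with average linkage. The data, condition names and function names are hypothetical and chosen for illustration only; they are not the NHS data or the code of any reviewed study. Taking 1 minus the proximity as the distance is an assumption of this sketch, and zero denominators are not guarded against.

```python
from itertools import combinations
from math import sqrt

# Hypothetical toy data: one row per individual, one column per condition (1 = present).
CONDITIONS = ["hypertension", "diabetes", "depression", "anxiety"]
DATA = [
    (1, 1, 0, 0), (1, 1, 0, 0), (1, 0, 0, 0), (0, 1, 0, 0),
    (0, 0, 1, 1), (0, 0, 1, 1), (0, 0, 1, 0), (0, 0, 0, 1),
    (1, 1, 1, 1), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0),
]


def two_by_two(i, j, data=DATA):
    """Return (a, b, c, d): both present, only i, only j, neither."""
    a = sum(1 for row in data if row[i] and row[j])
    b = sum(1 for row in data if row[i] and not row[j])
    c = sum(1 for row in data if not row[i] and row[j])
    d = len(data) - a - b - c
    return a, b, c, d


def jaccard(a, b, c, d):
    # Ignores individuals with neither condition (d is unused).
    return a / (a + b + c)


def yule_q(a, b, c, d):
    # (OR - 1) / (OR + 1) written with cross-products; ranges from -1 to 1.
    return (a * d - b * c) / (a * d + b * c)


def phi(a, b, c, d):
    # Pearson correlation between two binary variables.
    return (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d))


def average_linkage(names, proximity, n_clusters):
    """Naive agglomerative clustering; distance is taken as 1 - proximity (an assumption)."""
    dist = {pair: 1.0 - proximity(*two_by_two(*pair))
            for pair in combinations(range(len(names)), 2)}

    def mean_dist(c1, c2):
        # Average of all pairwise distances between members of the two clusters.
        return sum(dist[(min(x, y), max(x, y))] for x in c1 for y in c2) / (len(c1) * len(c2))

    clusters = [[i] for i in range(len(names))]
    while len(clusters) > n_clusters:
        # Merge the two clusters with the smallest average distance.
        p, q = min(combinations(range(len(clusters)), 2),
                   key=lambda pq: mean_dist(clusters[pq[0]], clusters[pq[1]]))
        merged = clusters[p] + clusters[q]
        clusters = [c for k, c in enumerate(clusters) if k not in (p, q)] + [merged]
    return sorted(sorted(names[i] for i in c) for c in clusters)


if __name__ == "__main__":
    table = two_by_two(0, 1)  # hypertension vs diabetes
    print("Jaccard:", jaccard(*table), "Yule Q:", round(yule_q(*table), 3), "phi:", phi(*table))
    print(average_linkage(CONDITIONS, yule_q, n_clusters=2))
```

On this toy data the pair hypertension and diabetes scores 0.60 (Jaccard), about 0.91 (Yule Q) and 0.625 (phi). The same co-occurrence counts therefore yield different proximities, which is one way heterogeneity can enter before any clustering algorithm is applied. The toy data were constructed to be clear-cut, so both Jaccard and Yule Q recover the same two groups here; with real survey data the groupings need not agree.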

Table 1.

Analytical methodology for the identification of multimorbid condition groups (41 studies)

| Study, year | No. of individuals; no. of conditions | Statistical method | Clustering algorithm | Proximity measure | Determination of number of groups |
|---|---|---|---|---|---|
| Abad-Díez et al.,58 2014 | 72 815; 51 | Exploratory factor analysis | Principal-factor method; Oblique (Oblimin) rotation | Tetrachoric correlation matrix | Scree plot; factor loadings ≥0.25 |
| Alonso-Moran et al.,59 2015 | 126 889 patients with T2DM; 28 | Agglomerative hierarchical cluster analysis | Ward’s minimum-variance method | Pearson’s dissimilarity | Cut-off point of 0.98 imposed on dendrogram |
| Clerencia-Sierra et al.,60 2015 | 924 patients aged ≥65 years; 59 | Exploratory factor analysis (stratified by sex) | Principal-factor method; rotation method NS | Tetrachoric correlation matrix | Scree plot; factor loadings ≥0.25 |
| Cornell et al.,61 2007 | 1 645 314; 45 | Agglomerative hierarchical cluster analysis | Lance-Williams method with β set at –0.50 | Jaccard coefficient | Evaluated by seven health services researchers relative to three clinical criteria |
| Dong et al.,62 2013 | 496; 16 | Agglomerative hierarchical cluster analysis (stratified by sex) | Average linkage | Yule Q | Cut-off between 0.2 and 0.3 imposed on dendrogram |
| Dorenkamp et al.,63 2016 | 3386; 15 | Agglomerative hierarchical cluster analysis | Ward’s minimum-variance method | Squared Euclidean distance | Agglomerative coefficient; dendrogram; Pseudo F statistic |
| Foguet-Boreu et al.,64 2015 | 322 328 aged ≥65 years; 99 | Agglomerative hierarchical cluster analysis (stratified by age group and sex) | Ward’s minimum-variance method | Jaccard coefficient | Semi-partial R2; Calinski-Harabasz Pseudo F; Pseudo T2 statistic |
| Formiga et al.,65 2013 | 328 aged 85 years; 16 | Agglomerative hierarchical cluster analysis | Average agglomeration | Yule Q | NS |
| Gabilondo et al.,66 2017 | 2 255 406; 27 | Agglomerative hierarchical cluster analysis (stratified by sex) | Ward’s minimum-variance method | Pearson’s dissimilarity | Cut-off based on clinical importance |
| García-Olmos et al.,67 2012 | 198 670; 26 | Multiple correspondence analysis (account for age and sex) | Graphic of category points in selected dimensions to indicate ‘importance’ | Data matrix | Benzécri inertia adjustment; inertia >0.05 |
| Garin et al.,68 2014 | 3625 aged >50 years; 11 | Exploratory factor analysis | Extraction method NS; Oblique (Oblimin) rotation | Tetrachoric correlation matrix | Eigenvalue ≥1; factor loadings >0.25 |
| Garin et al.,69 2016 | 41 909 aged >50 years; 12 | Exploratory factor analysis | Extraction method NS, Oblique rotation | Tetrachoric correlation matrix | Parallel analysis by simulation; Scree test |
| Goldstein et al.,70 2010 | 3591 homeless veterans; 16 | Factor analysis | Principal-component method; Varimax rotation | Pearson-correlation matrix | Eigenvalue ≥1 |
| Gu et al.,71 2017 | 2452 aged ≥60 years; 13 | Factor analysis | Principal-component method; Oblique (Oblimin) rotation | Tetrachoric correlation matrix | Eigenvalue >1; factor loadings ≥0.25 |
| Held et al.,54 2016 | 1464 aged ≥70 years; 17 | Network and cluster analysis | Modularity class analysis | ForceAtlas2 layout (geometric distance and degree of nodes) | Resolution = 0.6 |
| Hermans and Evenhuis,72 2014 | 1047 with at least two conditions and aged ≥50 years with ID; 14 | Exploratory factor analysis | Maximum-likelihood method; Varimax rotation | Tetrachoric correlation matrix | CFI; RMSEA; SRMR |
| Herr et al.,73 2015 | 117 with diagnosed heart failure; 10 | Factor analysis | Principal-component method; Oblimin rotation | Pearson-correlation matrix | Scree plot |
| Holden et al.,28 2011 | 78 430; 23 | Exploratory factor analysis | Extraction method NS; orthogonal quartimax/quartimin rotation | Tetrachoric correlation matrix | Scree plot; Eigenvalue >1.0; SRMR < 0.05; CFI & TLI > 0.95; factor loadings (magnitude) >0.4 |
| Islam et al.,74 2014 | 4574; 10 | Agglomerative hierarchical cluster analysis; principal-component analysis | Average linkage; Varimax rotation | Yule Q; tetrachoric correlation matrix | Dendrogram; Agglomerative coefficient; Scree plot; Eigenvalue >1; SRMSR; CFI; TLI; factor loadings (magnitude) >0.3 |
| Jackson et al.,75 2015 | 7270 women aged >75 years; 31 | Exploratory factor analysis | Principal-factor method; Varimax rotation | Tetrachoric correlation matrix | Eigenvalue >1; Scree plot |
| Jackson et al.,76 2016 | 4896 women; 31 | Exploratory factor analysis | Principal-factor method; Varimax rotation | Tetrachoric correlation matrix | Eigenvalue >1; Scree plot |
| John et al.,25 2003 | 1039 aged ≥60 years; 11 | Agglomerative hierarchical cluster analysis | Complete linkage; average linkage | Phi coefficient; Yule Q | Sensitivity analysis; Goodman and Kruskal’s gamma |
| Jovic et al.,77 2016 | 13 103 aged ≥20 years; 13 | Exploratory factor analysis (stratified by age group and sex) | Principal-component method; Varimax orthogonal rotation | NS | Scree plot; Factor loadings >0.25; Eigenvalues >1 |
| Kim et al.,78 2012 | 1844 HIV-infected patients; 15 | Exploratory factor analysis | Extraction method NS; oblique rotation | Tetrachoric correlation matrix | Eigenvalues; CFL; TLI; RMSEA |
| Kirchberger et al.,79 2012 | 4127 aged ≥65 years; 13 | Exploratory factor analysis | Principal-factor method; oblique (Oblimin) rotation | Tetrachoric correlation matrix | Eigenvalue ≥1; factor loadings ≥0.25 |
| Kumar et al.,80 2017 | 2134 aged ≥50 years; 45 | Hierarchical cluster analysis using treelet transformation | Principal-component analysis | Correlation | Dendrogram; cut level of 12 determined by a 10-fold cross-validation procedure |
| Magnan et al.,81 2018 | 29 562 with types 1 or 2 diabetes; 12 | Exploratory factor analysis | Robust weight least squares; oblique (Geomin) rotation | Tetrachoric correlation matrix | Eigenvalue >1; interpretability of factors |
| Marengoni et al.,20 2009 | 1099; 15 | Agglomerative hierarchical cluster analysis | Average linkage | Yule Q | NS |
| Marengoni et al.,82 2010 | 1332 aged ≥65 years; 19 | Agglomerative hierarchical cluster analysis | Average linkage | Yule Q | NS |
| Marengoni et al.,83 2013 | n1 = 1155, n2 = 1173; 19 | Agglomerative hierarchical cluster analysis | Average linkage | Yule Q | NS |
| Ng et al.,34 2012 | 8841; 24 | Three-step cluster analysis | Searching all comorbid pairs to form groups | Somers’ D statistic | Association test using the Benjamini–Hochberg procedure |
| Nurnberg et al.,84 1991 | 110; 12 | Factor analysis | Principal-factor method; orthogonal Varimax rotation | Correlation matrix | Eigenvalue >1; factor loadings ≥0.4 |
| Poblador-Plou et al.,85 2014 | n1 = 79 291, n2 = 275 682; 114 | Exploratory factor analysis | Principal-factor method; oblique (Oblimin) rotation | Tetrachoric correlation matrix | Scree plot; factor loadings ≥0.25 |
| Prados-Torres et al.,86 2012 | 275 682; 13–39 for various age group and sex | Exploratory factor analysis (stratified by age group and sex) | Principal-factor method; oblique (Oblimin) rotation | Tetrachoric correlation matrix | Scree plot; factor loadings ≥0.25 |
| Prazeres et al.,87 2015 | 1993; 16 | Agglomerative hierarchical cluster analysis | Average linkage | Yule Q | NS |
| Ruiz et al.,88 2015 | 2 788 900 aged ≥65 years; 20 | Correspondence analysis | Graphic of category points in selected dimensions to quantify information explained | Data matrix | Benzécri inertia adjustment; Greenacre approximation |
| Schäfer et al.,89 2010 | 149 280 aged ≥65 years; 46 | Exploratory factor analysis (stratified by sex) | Principal-factor method; oblique (Oblimin) rotation | Tetrachoric correlation matrix | Eigenvalue ≥1; factor loadings ≥0.25 |
| Sideris et al.,90 2016 | No. of individuals and conditions NS | Agglomerative hierarchical cluster analysis | Ward’s minimum-variance method | Jaccard coefficient | Dendrogram; cut level of 1.5 determined based on a validation procedure |
| Vu et al.,91 2011 | 3729 with at least two conditions and aged ≥65 years (fall-related injuries); 14 | Agglomerative hierarchical cluster analysis | Average linkage | Jaccard coefficient; Yule Q | Calinski/Harabasz index; Duda/Hart index |
| Walker et al.,92 2016 | 5647 aged ≥55 years; 19 | Exploratory factor analysis | Principal-factor method; oblique (Oblimin) rotation | Tetrachoric correlation matrix | Scree plot; Eigenvalue ≥1; factor loadings >0.25 |
| Wang et al.,93 2015 | 1480 aged ≥60 years; 16 | Exploratory factor analysis | Extraction method NS; oblique (Oblimin) rotation | Tetrachoric correlation matrix | Eigenvalue >1; factor loadings ≥0.25 |

NS, not specified; CFI, comparative fit index; TLI, Tucker-Lewis index; RMSEA, root mean square error of approximation; SRMR, standardized root mean square residual; ID, intellectual disabilities.

Under RMSR and SRMR, the extracted factor construct is satisfactory if the test is less than 0.10 and 0.05, respectively.94,95 When using TLI and CFI, acceptable factor constructs are obtained if both statistics are greater than 0.95.95

Analytical methods

Our review identified five analytical methods used for identifying multimorbid condition groups (Table 1). We describe these five methods below, along with their adaptations specifically for multimorbidity research.

Factor-analysis method

The role of factor analysis is to identify ‘latent’ factors based on the assumption that variables associated with the same factor share a common underlying trait that is responsible for the correlation among them. The implementation of factor analysis begins by factoring a correlation or covariance matrix that measures the pairwise associations between the observed variables. For the dichotomous morbidity variable, a tetrachoric correlation matrix will lead to more valid results.39 Factoring the matrix requires statistical estimation procedures such as the principal-factor method, maximum-likelihood method and least squares, among others.40–42
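To illustrate why the choice of correlation matrix matters for dichotomous morbidity variables, the sketch below compares the phi (Pearson) coefficient with a tetrachoric correlation for a single pair of conditions. It is only a minimal illustration: the tetrachoric value is computed with the cosine-pi approximation rather than the maximum-likelihood estimate that statistical packages normally use, and the function names and the counts are hypothetical.

```python
import math


def phi_coefficient(a, b, c, d):
    """Pearson (phi) correlation for a 2x2 table.

    a = both conditions present, b = first only,
    c = second only, d = neither condition.
    """
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denom


def tetrachoric_approx(a, b, c, d):
    """Cosine-pi approximation to the tetrachoric correlation.

    Assumption: this is an approximation chosen for brevity, not the
    maximum-likelihood estimate used by most statistical software.
    """
    if b * c == 0:
        if a * d == 0:
            raise ValueError("degenerate table: association is undefined")
        return 1.0  # infinite odds ratio, perfect positive association
    odds_ratio = (a * d) / (b * c)
    return math.cos(math.pi / (1.0 + math.sqrt(odds_ratio)))


# Hypothetical counts for two conditions in 100 individuals.
phi = phi_coefficient(40, 10, 10, 40)
tet = tetrachoric_approx(40, 10, 10, 40)
```

In this made-up table the phi coefficient is smaller in magnitude than the approximate tetrachoric correlation, which is the usual direction of the difference when binary indicators are treated as if they were continuous.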

A crucial component of factor analysis is the decision about the number of factors to retain/extract. Common factor-extraction criteria are the eigenvalue-greater-than-one rule, the graphical scree test, parallel analysis and the method of explained variance. Studies have shown that the eigenvalue-greater-than-one rule leads to overextraction (inclusion of extraneous factors),43,44 especially in the presence of outliers, because their effect can cause eigenvalues to increase dramatically.45 The method of specified variance leads to few errors, although it can yield too many factors and lead to constructs with little theoretical value.45 Goodness-of-fit indexes are often considered to confirm the adequacy of the selected number of factors, including Tucker-Lewis index (TLI), root mean square residual (RMSR), standardized root mean square residual (SRMR) and comparative fit index (CFI).

Factor rotation is also required to obtain a solution with a simple structure that is parsimonious and easily interpretable. There are two broad classes of rotation techniques. Orthogonal rotation (encompassing Varimax, Quartimax and Equamax) tries to separate variables into non-overlapping clusters, each possibly associated with one factor, in such a way that the factors are independent. Oblique rotation, such as Oblimin, Quartimin and Promax, permits correlation among the factors, so that variables can be associated with more than one factor.

Hierarchical-clustering method

The aim in cluster analysis is to assign entities (such as health conditions) into groups (called clusters) so that entities in the same cluster are more similar to one another than to entities from different clusters. Cluster analysis is also known as ‘unsupervised classification’, where there is no a priori information regarding the underlying group structure.9 Hierarchical algorithms46 define a dendrogram relating similar entities in the same subtrees, based on a proximity (distance) measure of choice among entities. Agglomerative hierarchical methods form clusters by progressive fusions: each entity is initially assigned to its own cluster, and the two most similar clusters are then joined iteratively until there is a single cluster with all entities. At each stage, distances between clusters are updated according to the particular method selected to define the distance between two clusters. Typical methods include ‘Average linkage’ (the distance between two clusters is defined by the average of the pairwise distances between two entities, one from each of the two clusters), ‘Single linkage’ (defined as the minimum of the pairwise distances), ‘Complete linkage’ (defined as the maximum of the pairwise distances), ‘Ward’s’ (defined by the minimum variance in terms of the pooled within-cluster sum of squares) and the Lance and Williams47 distance formula that covers the above methods as special cases. Suppose that cluster i and cluster j are being combined to form the single cluster k, then the Lance and Williams47 formula expresses the distance d(k, l) between this new cluster k and any currently existing cluster l as:
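$$d(k, l) = \alpha_i\, d(i, l) + \alpha_j\, d(j, l) + \beta\, d(i, j) + \gamma\, \lvert d(i, l) - d(j, l) \rvert \tag{1}$$
where the coefficients αi, αj, β and γ characterize the particular methods. The methods for updating the distance between clusters often have a great impact on the final clustering results.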
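As a minimal sketch of how the coefficients select a linkage method, the function below applies the standard Lance and Williams update, d(k, l) = αi d(i, l) + αj d(j, l) + β d(i, j) + γ |d(i, l) − d(j, l)|, with the textbook coefficient values for single, complete and average linkage. The function name, the argument names and the example distances are assumptions made for illustration.

```python
def lance_williams(d_il, d_jl, d_ij, method, n_i=1, n_j=1):
    """Distance from the merged cluster k = (i, j) to another cluster l.

    n_i and n_j are the sizes of clusters i and j; they are only
    used by average linkage.
    """
    if method == "single":
        a_i, a_j, beta, gamma = 0.5, 0.5, 0.0, -0.5
    elif method == "complete":
        a_i, a_j, beta, gamma = 0.5, 0.5, 0.0, 0.5
    elif method == "average":
        a_i = n_i / (n_i + n_j)
        a_j = n_j / (n_i + n_j)
        beta, gamma = 0.0, 0.0
    else:
        raise ValueError("unsupported method: " + method)
    return a_i * d_il + a_j * d_jl + beta * d_ij + gamma * abs(d_il - d_jl)


# Hypothetical distances: d(i, l) = 2, d(j, l) = 5, d(i, j) = 3.
single = lance_williams(2.0, 5.0, 3.0, "single")      # minimum of 2 and 5
complete = lance_williams(2.0, 5.0, 3.0, "complete")  # maximum of 2 and 5
average = lance_williams(2.0, 5.0, 3.0, "average", n_i=1, n_j=3)
```

With these coefficients the single and complete updates reduce to the minimum and maximum of the two existing distances, and the average update is the size-weighted mean.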

Proximity between entities can be quantified by either a similarity or a dissimilarity measure. In multimorbidity research, the proximity between two conditions is based on binary variables indicating the presence or absence of conditions in each individual and can be presented as a two-by-two contingency table.
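The two-by-two table and two of the similarity measures that appear in Table 1 (Jaccard coefficient and Yule Q) can be sketched as follows. The helper names and the toy data are hypothetical; a, b, c and d denote the counts of individuals with both conditions, the first only, the second only and neither.

```python
def contingency(x, y):
    """Return (a, b, c, d) for two equal-length 0/1 indicator lists."""
    a = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 1)
    b = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 0)
    c = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 1)
    d = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 0)
    return a, b, c, d


def jaccard(a, b, c, d):
    """Share of individuals with both conditions among those with either."""
    return a / (a + b + c)


def yule_q(a, b, c, d):
    """Yule Q, equal to (OR - 1) / (OR + 1) with OR the odds ratio."""
    return (a * d - b * c) / (a * d + b * c)


# Toy data: presence (1) or absence (0) of two conditions in 8 people.
cond_1 = [1, 1, 1, 0, 0, 0, 0, 0]
cond_2 = [1, 1, 0, 1, 0, 0, 0, 0]
table = contingency(cond_1, cond_2)
jac = jaccard(*table)
q = yule_q(*table)
```

The Jaccard coefficient ignores the count d of individuals with neither condition, whereas Yule Q uses all four cells, which is one reason the two measures can produce different groupings from the same data.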

Unified-clustering algorithm

Ng et al.34 proposed a three-step unified-clustering method to identify groups of multimorbid conditions. The method specifically addresses three statistical issues for using cluster analysis to study multimorbidity patterns, namely the adjustment for multimorbidity by chance, the uniqueness of clustering results and the control for false discovery. To this end, the first step adopts the asymmetric Somers’ D statistic to quantify the degree of non-random multimorbidity between any two conditions. The Somers’ D statistic adjusts for multimorbidity by chance and offers higher power to detect non-random multimorbidity compared with other common concordance statistics such as kappa, Kendall’s Tau-b, gamma and adjusted Rand index.34 With p conditions, the first step computes asymmetric Somers’ D statistics for p(p-1)/2 distinct pairs of conditions. The second step assesses their strength of association using the Benjamini–Hochberg procedure to control for the false discovery rate (FDR) at level α and offer protection against false positives.48 If q pairs of conditions are truly non-random, this procedure obtains the null distribution of Somers’ D statistics using a permutation approach and sets the error rate so that the expected number of false positives among these q pairs is less than one.34 In the third step, the algorithm adopts a ‘clumping’ clustering technique to search all comorbid pairs of conditions and then form groups of multimorbid conditions in which all conditions belonging to a group coexist with one another. The strength of multimorbidity among conditions in a group is given by the average of the Somers’ D statistics for all pairs of conditions belonging to the group.
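The first two steps can be sketched in a few lines. This is an illustrative simplification, not the authors' implementation: Somers’ D is computed from its closed form for a two-by-two table, and the Benjamini–Hochberg step takes p-values as given, whereas the method described above derives them from a permutation null distribution. All names and numbers are hypothetical.

```python
def somers_d(a, b, c, d):
    """Asymmetric Somers' D(Y|X) for a 2x2 table.

    Rows are X (present: a, b; absent: c, d), columns are Y.
    For binary data this equals P(Y=1 | X=1) - P(Y=1 | X=0).
    """
    return a / (a + b) - c / (c + d)


def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the null is rejected."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    # Find the largest rank k with p_(k) <= (k / m) * alpha.
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            rejected[idx] = True
    return rejected


# Hypothetical table: D(Y|X) and D(X|Y) differ, so the statistic is asymmetric.
d_y_given_x = somers_d(20, 10, 30, 40)
d_x_given_y = somers_d(20, 30, 10, 40)

# Hypothetical p-values for eight pairs of conditions.
p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
flags = benjamini_hochberg(p_vals, alpha=0.05)
```

Only the pairs flagged True would be passed on to the third (‘clumping’) step as non-random comorbid pairs.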

This three-step method identifies overlapping groups of multimorbid conditions, which are easy to interpret. It is more realistic that some health conditions could belong to more than one multimorbidity group simultaneously.28 For the subsequent clustering of individuals on the basis of their multimorbidity patterns, Ng9 proposed a procedure to convert the overlapping groups into non-overlapping groups of multimorbid conditions. An updated version of the procedure is available, where we add an extra criterion in forming the non-overlapping groups. This new criterion reflects the idea that a given health condition and its ‘closest’ condition (with which the pairwise Somers’ D statistic is maximum) should form a multimorbid pair and belong to the same group. UCINET6 for Windows49 can be used to display graphically the network of health conditions, which are classified either as a ‘closed’ non-overlapping group (all conditions in a group coexist with one another) or a ‘semi-closed’ non-overlapping group (conditions in a group coexist with more than half of the members).

Multiple correspondence analysis

Multiple correspondence analysis (MCA)50 is a non-parametric multivariate method that uses graphical procedures to reveal the association between categorical variables, including binary, nominal and ordinal.51 The method attempts to present multivariate categorical data in a low-dimensional space (a counterpart of principal-component analysis for categorical data).

MCA is concerned with the decomposition of a data matrix to generate diagrams of row and column coordinates, called maps, which display the associations amongst (within) the variables. The decomposition can be approached using a Burt matrix.52 Obtaining the coordinates of row and column profiles with respect to the principal axes (dimensions) involves eigenvalue-eigenvector decomposition of the Burt matrix.53 The Burt matrix is a cross-product of an indicator matrix that arrays together all two-way cross-tabulations of all the variables. It is symmetrical and its decomposition yields a solution with row and column profiles that account for the largest proportion of variance (inertia). The Burt matrix approach may overestimate the total inertia due to the existence of submatrices on the main diagonal that are cross-tabulations of each variable with itself.52,53 This problem can be avoided by using either adjusted inertias or joint correspondence analysis.53

As in most dimension-reduction techniques, the optimal number of dimensions to retain is of concern in MCA. It is determined by defining a threshold equal to 1 divided by the total number of variables in use.52 The dimensions with inertia or adjusted inertia greater than the defined threshold are retained. Interpretation can be done by inspecting the map, taking into consideration the dimensions on which each response category is best positioned based on proximity. The positions of the response categories along a given dimension set out the magnitude and sign of their loadings on that dimension. Response categories with the same sign on a given dimension are deemed to be in association with one another, but inversely related to any category with an opposite sign.
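The retention rule can be sketched as below, together with one common form of the Benzécri adjustment mentioned in Table 1. Both functions are illustrative assumptions: the inputs are taken to be the principal inertias of the indicator matrix (the square roots of the Burt-matrix eigenvalues), the adjustment formula is the usual textbook version rather than one confirmed by the studies reviewed here, and the numbers are invented.

```python
def retained_dimensions(principal_inertias, n_variables):
    """Indices of dimensions whose inertia exceeds 1 / n_variables."""
    threshold = 1.0 / n_variables
    return [i for i, lam in enumerate(principal_inertias) if lam > threshold]


def benzecri_adjusted(principal_inertias, n_variables):
    """Adjusted inertias for the dimensions above the threshold.

    Assumed formula: ((Q / (Q - 1)) * (lambda - 1 / Q)) ** 2,
    with Q the number of variables.
    """
    q = n_variables
    threshold = 1.0 / q
    return [
        ((q / (q - 1)) * (lam - threshold)) ** 2
        for lam in principal_inertias
        if lam > threshold
    ]


# Hypothetical principal inertias for an analysis of 4 binary variables.
inertias = [0.5, 0.3, 0.25, 0.1]
kept = retained_dimensions(inertias, n_variables=4)
adjusted = benzecri_adjusted(inertias, n_variables=4)
```

Here the threshold is 1/4, so only the first two dimensions are kept; the third sits exactly on the threshold and is dropped because the comparison is strict.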

Network and cluster analyses

Held et al.54 proposed the use of network analysis and cluster analysis to reveal networks of conditions. Network analysis was performed in the Gephi software using ForceAtlas2, a force-directed continuous graph layout algorithm for network visualization designed for Gephi.55 ForceAtlas2 simulates a physical system in order to spatialize a network, where nodes (representing the conditions) repulse each other like charged particles and edges attract their nodes like springs such that these forces create a movement that converges to a balanced state.55 Different formulae have been adopted to represent the attraction and repulsion forces; the ForceAtlas2 layout uses a classical attraction force Fa that depends linearly on the distance between two nodes d(n1, n2) and a repulsion force Fr that is inversely proportional to the distance d(n1, n2) as:
$$F_a(n_1, n_2) = d(n_1, n_2) \tag{2}$$
$$F_r(n_1, n_2) = k_r \, \frac{(\deg(n_1) + 1)(\deg(n_2) + 1)}{d(n_1, n_2)} \tag{3}$$
where n1 and n2 are two connected nodes, the function deg(·) indicates the degree of each node (which is defined as the number of connected nodes that a node has) and kr is the constant coefficient for the repulsive force. Increasing kr expands the size of the graph. A modularity (or collective proximity) analysis56 with the resolution57 set to 0.6 is then applied to the layout of conditions in order to cluster the conditions into non-overlapping groups (or communities) of conditions.54 Lower resolution increases the number of condition groups.
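A minimal sketch of the two forces follows, assuming the usual ForceAtlas2 definitions: attraction equal to the distance between the nodes, and repulsion equal to kr (deg(n1) + 1)(deg(n2) + 1) / d(n1, n2). It only evaluates the force magnitudes for one pair of nodes and does not reproduce the Gephi layout or the modularity step; the function names and values are hypothetical.

```python
def attraction_force(distance):
    """Attraction grows linearly with the distance between two nodes."""
    return distance


def repulsion_force(distance, deg_1, deg_2, k_r=1.0):
    """Repulsion is inversely proportional to the distance.

    The +1 terms keep the force non-zero for nodes of degree zero.
    """
    return k_r * (deg_1 + 1) * (deg_2 + 1) / distance


# Hypothetical pair of conditions with degrees 3 and 1, placed 2 units apart.
f_a = attraction_force(2.0)
f_r = repulsion_force(2.0, deg_1=3, deg_2=1, k_r=1.0)
f_r_scaled = repulsion_force(2.0, deg_1=3, deg_2=1, k_r=2.0)
```

Doubling kr doubles the repulsion at a given distance, which is consistent with the statement that a larger kr spreads the graph out.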

Summary of review findings

Just over half of the 41 fully reviewed papers presented in Table 1 adopted the factor-analysis method (21 studies; 51%), making it the most widely used analytical approach for identifying multimorbid condition groups in previous studies. Almost all of them (18 studies) used tetrachoric correlation matrices to account for binary morbidity data. Among them, 14 studies used the oblique method for rotation of factors.58,68,71,96,97 Only eight studies used the Varimax or the orthogonal quartimax rotation methods.28,72,74

The next most commonly used analytical method was the hierarchical-clustering algorithm (16 studies; 39%). Average linkage,20,25,61,62,64,82,87,91 Ward’s minimum-variance method59,63,64,98–102 and the Lance and Williams formula61,101,103 were the common methods adopted to update distances between clusters. Typical similarity measures adopted were the Jaccard coefficient61,64,91,101,103 and Yule Q.20,25,62,65,74,82,87,104

The other three analytical methods were only adopted in recent studies. In particular, the unified three-step clustering method was developed specifically for clustering health conditions,34 the MCA technique is a multivariate statistical method specifically for uncovering associations among categorical (binary) morbidity data67,105 and the network and cluster analyses were employed to identify sub-networks or groups of connected health conditions.54

The quality of reporting of the 41 included studies was generally good, with quality scores rated from 14 to 21 out of 23 (see the detailed rating of these studies provided in Supplementary Table 3, available as Supplementary data at IJE online). The included papers with a lower reporting quality score were mainly methodology papers, which focused on the method for identifying groups of multimorbid conditions and illustrated its applicability using real datasets as an example.

Comparison analysis results

We applied the five analytical methods to the same Australian NHS 2007–08 data. We considered 25 health conditions, whose prevalence rates are given in Table 2. Supplementary Figures 1–4, available as Supplementary data at IJE online, provide the algorithms in R (R Foundation for Statistical Computing, Vienna, Austria) for the first four analytical methods to identify multimorbid groups of these 25 conditions. In Supplementary Figure 5, available as Supplementary data at IJE online, we display a snapshot to illustrate the fifth method with the network and cluster analyses. Table 2 summarizes the identified groups of multimorbid conditions.

Table 2.

List of 25 health conditions and the groups of multimorbid health conditions identified by various analytical methods (NHS data)

Prevalence rate of health condition (in decreasing order): Hay fever (16.4%), Hypertension (11.0%), Asthma (10.5%), Back pain (9.4%), Osteoarthritis (9.2%), Sinusitis (9.0%), Cholesterol (6.6%), Migraine (5.8%), Disc disorder (5.7%), Food allergy (5.6%), Mood disorder (5.5%), Gout (5.2%), Osteoporosis (3.8%), Anxiety (3.6%), Diabetes (3.6%), Depression (3.5%), Thyroid (2.8%), Rheumatoid (2.6%), Psoriasis (2.3%), Hernia (2.2%), Bronchitis (2.0%), Angina (2.0%), Anaemia (1.8%), Oedema (1.7%), Dermatitis (1.1%)

| Analytical method (specific characteristics) | Numbers of groups and conditions | Groups of multimorbid health conditions (number of conditions in each group) |
|---|---|---|
| Factor analysis (oblique Oblimin rotation) | Five groups of 13 health conditions | Diabetes, Cholesterol, Angina, Hypertension (4); Anxiety, Depression, Mood disorder (3); Osteoarthritis, Osteoporosis (2); Hay fever, Sinusitis (2); Bronchitis, Asthma (2) |
| Factor analysis (orthogonal quartimax rotation)^a | Five groups of 16 health conditions | Diabetes, Cholesterol, Angina, Oedema, Hypertension, Osteoarthritis, Osteoporosis, Gout (8); Anxiety, Depression, Mood disorder, Migraine (4); Osteoarthritis, Osteoporosis (2); Hay fever, Sinusitis (2); Bronchitis, Asthma (2) |
| Hierarchical-clustering algorithm (Yule Q coefficient, average linkage) | Five groups of 22 health conditions | Diabetes, Cholesterol, Angina, Oedema, Hypertension, Dermatitis, Osteoarthritis, Rheumatoid, Osteoporosis, Gout (10); Anxiety, Depression, Mood disorder, Migraine (4); Hay fever, Sinusitis, Bronchitis, Asthma (4); Anaemia, Thyroid (2); Hernia, Disc disorder (2) |
| Hierarchical-clustering algorithm (Jaccard coefficient, average linkage) | Two groups of 19 health conditions | Diabetes, Cholesterol, Angina, Oedema, Hypertension, Disc disorder, Osteoarthritis, Osteoporosis, Gout (9); Anxiety, Depression, Mood disorder, Migraine, Hay fever, Sinusitis, Bronchitis, Asthma, Back pain, Food allergy (10) |
| Three-step unified-clustering method^b | Five groups of 19 health conditions | Diabetes, Cholesterol, Angina, Oedema, Hypertension, Gout (6); Osteoarthritis, Osteoporosis, Thyroid (3); Bronchitis, Asthma (2); Migraine, Hay fever, Sinusitis, Food allergy (4); Anxiety, Depression, Mood disorder, Back pain (4) |
| Multiple correspondence analysis (two dimensions) | Four groups of 19 health conditions | Thyroid, Rheumatoid, Hernia, Osteoarthritis, Osteoporosis (5); Diabetes, Cholesterol, Hypertension, Gout (4); Back pain, Psoriasis (2); Asthma, Hay fever, Food allergy, Sinusitis, Mood disorder, Migraine, Anxiety, Depression (8) |
| Multiple correspondence analysis (three dimensions)^c | Six groups of 22 health conditions | Disc disorder, Thyroid, Rheumatoid, Hernia, Osteoarthritis, Osteoporosis (6); Diabetes, Cholesterol, Hypertension, Gout (4); Back pain, Psoriasis (2); Asthma, Hay fever, Food allergy, Sinusitis, Mood disorder, Migraine, Anxiety, Depression, Bronchitis, Anaemia (10); Food allergy, Asthma, Hay fever, Sinusitis (4); Anxiety, Depression, Mood disorder (3) |
| Network and cluster analyses method | Five groups of 25 health conditions | Diabetes, Cholesterol, Angina, Oedema, Hypertension, Gout (6); Hay fever, Asthma, Sinusitis, Bronchitis, Dermatitis, Food allergy (6); Anxiety, Depression, Mood disorder, Disc disorder (4); Migraine, Back pain, Anaemia, Psoriasis (4); Thyroid, Rheumatoid, Hernia, Osteoarthritis, Osteoporosis (5) |

Table 2.

List of 25 health conditions and the groups of multimorbid health conditions identified by various analytical methods (NHS data)

Prevalence rate of health condition (in decreasing order): Hay fever (16.4%), Hypertension (11.0%), Asthma (10.5%), Back pain (9.4%), Osteoarthritis (9.2%), Sinusitis (9.0%), Cholesterol (6.6%), Migraine (5.8%), Disc disorder (5.7%), Food allergy (5.6%), Mood disorder (5.5%), Gout (5.2%), Osteoporosis (3.8%), Anxiety (3.6%), Diabetes (3.6%), Depression (3.5%), Thyroid (2.8%), Rheumatoid (2.6%), Psoriasis (2.3%), Hernia (2.2%), Bronchitis (2.0%), Angina (2.0%), Anaemia (1.8%), Oedema (1.7%), Dermatitis (1.1%)

| Analytical method (specific characteristics) | Numbers of groups and conditions | Groups of multimorbid health conditions (number of conditions in each group) |
| --- | --- | --- |
| Factor analysis (oblique Oblimin rotation) | Five groups of 13 health conditions | Diabetes, Cholesterol, Angina, Hypertension (4); Anxiety, Depression, Mood disorder (3); Osteoarthritis, Osteoporosis (2); Hayfever, Sinusitis (2); Bronchitis, Asthma (2) |
| Factor analysis (orthogonal quartimax rotation)^a | Five groups of 16 health conditions | Diabetes, Cholesterol, Angina, Oedema, Hypertension, Osteoarthritis, Osteoporosis, Gout (8); Anxiety, Depression, Mood disorder, Migraine (4); Osteoarthritis, Osteoporosis (2); Hayfever, Sinusitis (2); Bronchitis, Asthma (2) |
| Hierarchical-clustering algorithm (Yule Q coefficient, average linkage) | Five groups of 22 health conditions | Diabetes, Cholesterol, Angina, Oedema, Hypertension, Dermatitis, Osteoarthritis, Rheumatoid, Osteoporosis, Gout (10); Anxiety, Depression, Mood disorder, Migraine (4); Hayfever, Sinusitis, Bronchitis, Asthma (4); Anaemias, Thyroid (2); Hernia, Disc disorder (2) |
| Hierarchical-clustering algorithm (Jaccard coefficient, average linkage) | Two groups of 19 health conditions | Diabetes, Cholesterol, Angina, Oedema, Hypertension, Disc disorder, Osteoarthritis, Osteoporosis, Gout (9); Anxiety, Depression, Mood disorder, Migraine, Hayfever, Sinusitis, Bronchitis, Asthma, Backpain, Food allergy (10) |
| Three-step unified-clustering method^b | Five groups of 19 health conditions | Diabetes, Cholesterol, Angina, Oedema, Hypertension, Gout (6); Osteoarthritis, Osteoporosis, Thyroid (3); Bronchitis, Asthma (2); Migraine, Hayfever, Sinusitis, Food allergy (4); Anxiety, Depression, Mood disorder, Backpain (4) |
| Multiple correspondence analysis (two dimensions) | Four groups of 19 health conditions | Thyroid, Rheumatoid, Hernia, Osteoarthritis, Osteoporosis (5); Diabetes, Cholesterol, Hypertension, Gout (4); Backpain, Psoriasis (2); Asthma, Hayfever, Food allergy, Sinusitis, Mood disorder, Migraine, Anxiety, Depression (8) |
| Multiple correspondence analysis (three dimensions)^c | Six groups of 22 health conditions | Disc disorder, Thyroid, Rheumatoid, Hernia, Osteoarthritis, Osteoporosis (6); Diabetes, Cholesterol, Hypertension, Gout (4); Backpain, Psoriasis (2); Asthma, Hayfever, Food allergy, Sinusitis, Mood disorder, Migraine, Anxiety, Depression, Bronchitis, Anaemias (10); Food allergy, Asthma, Hayfever, Sinusitis (4); Anxiety, Depression, Mood disorder (3) |
| Network and cluster analyses method | Five groups of 25 health conditions | Diabetes, Cholesterol, Angina, Oedema, Hypertension, Gout (6); Hayfever, Asthma, Sinusitis, Bronchitis, Dermatitis, Food allergy (6); Anxiety, Depression, Mood disorder, Disc disorder (4); Migraine, Backpain, Anaemias, Psoriasis (4); Thyroid, Rheumatoid, Hernia, Osteoarthritis, Osteoporosis (5) |

Main similarities among the identified groupings of conditions are in bold: (i) cardiovascular and metabolic diseases: Diabetes, Cholesterol, Angina, Oedema, Hypertension, Gout; (ii) mental health problems: Anxiety, Depression, Mood disorder; and (iii) allergic diseases: Hayfever, Sinusitis, Food allergy.

^a Osteoarthritis and osteoporosis overlapped in Groups 1 and 3.

^b The first three groups were closed, whereas the last two groups were semi-closed (see text for definitions).

^c Asthma, Hayfever, Food allergy, Sinusitis, Mood disorder, Anxiety and Depression overlapped.

Factor analysis with tetrachoric correlations led to different groupings of multimorbid conditions when different model-selection and/or rotation methods were used. With the maximum-likelihood method, we identified five groups of 13 health conditions with the commonly used oblique Oblimin rotation method and five groups of 16 conditions when the orthogonal quartimax rotation method was used (Table 2). Supplementary Tables 4 and 5, available as Supplementary data at IJE online, display the factors and their loadings for these two rotation methods.
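
To make the input to the factor analysis concrete, the following is a minimal sketch of how a tetrachoric correlation can be approximated from the 2x2 table of two binary conditions. It uses the classical cosine-pi approximation based on the odds ratio, which is an assumption made here for brevity and not the maximum-likelihood estimate a statistical package would normally compute. The function name and the cell counts are hypothetical.

```python
import math


def tetrachoric_approx(a, b, c, d):
    """Cosine-pi approximation to the tetrachoric correlation.

    a = both conditions present, b = first only,
    c = second only, d = both absent (hypothetical cell counts).
    """
    # Degenerate tables: return the limiting values of the approximation.
    if b == 0 or c == 0:
        return 1.0
    if a == 0 or d == 0:
        return -1.0
    odds_ratio = (a * d) / (b * c)
    return math.cos(math.pi / (1.0 + math.sqrt(odds_ratio)))


# Independent conditions (odds ratio 1) give a correlation of about zero.
r_null = tetrachoric_approx(10, 10, 10, 10)
# Strong positive association (odds ratio 16).
r_pos = tetrachoric_approx(40, 10, 10, 40)
```

A matrix of such pairwise values over all 25 conditions would then be passed to the factor extraction and rotation steps, which are not reproduced here.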

Hierarchical-clustering algorithms also led to different groupings of multimorbid conditions when different proximity measures and different methods of calculating distances between clusters were used. Figure 1 displays the dendrogram using the Yule Q coefficient and the average linkage method. We identified five groups of 22 health conditions (Table 2). Supplementary Figure 6, available as Supplementary data at IJE online, shows the dendrogram of a different grouping result (two groups of 19 conditions) obtained with the Jaccard coefficient and the average linkage method (Table 2).
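
The two proximity measures and the average linkage rule can be sketched in a few lines. This is an illustrative pure-Python version on hypothetical toy data (eight respondents, four conditions), not the software or settings used for the NHS analysis. All function names are assumptions, and the stopping rule here is simply a requested number of clusters.

```python
from itertools import combinations


def two_by_two(x, y):
    """Cell counts (a, b, c, d) of the 2x2 table for two binary vectors."""
    a = sum(1 for xi, yi in zip(x, y) if xi and yi)
    b = sum(1 for xi, yi in zip(x, y) if xi and not yi)
    c = sum(1 for xi, yi in zip(x, y) if not xi and yi)
    return a, b, c, len(x) - a - b - c


def jaccard(x, y):
    a, b, c, _ = two_by_two(x, y)
    return a / (a + b + c) if (a + b + c) else 0.0


def yule_q(x, y):
    a, b, c, d = two_by_two(x, y)
    denom = a * d + b * c
    return (a * d - b * c) / denom if denom else 0.0


def average_linkage(data, similarity, n_clusters):
    """Agglomerative clustering: repeatedly merge the two clusters
    with the highest average pairwise similarity."""
    names = list(data)
    sim = {frozenset(p): similarity(data[p[0]], data[p[1]])
           for p in combinations(names, 2)}
    clusters = [[n] for n in names]

    def avg_sim(c1, c2):
        total = sum(sim[frozenset((u, v))] for u in c1 for v in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        i, j = max(combinations(range(len(clusters)), 2),
                   key=lambda ij: avg_sim(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]  # i < j, so index i stays valid
        del clusters[j]
    return clusters


# Hypothetical toy data: 1 = condition reported, 0 = not reported.
toy = {
    "diabetes":     [1, 1, 1, 0, 0, 0, 0, 0],
    "hypertension": [1, 1, 0, 1, 0, 0, 0, 0],
    "anxiety":      [0, 0, 0, 0, 1, 1, 1, 0],
    "depression":   [0, 0, 0, 0, 1, 1, 0, 1],
}
groups_q = average_linkage(toy, yule_q, 2)
groups_j = average_linkage(toy, jaccard, 2)
```

On this toy example both measures recover the same two groups. On real morbidity data the two measures can rank pairs differently, which is one way the groupings in Table 2 can diverge.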

Figure 1.

Hierarchical clustering of health conditions using Yule Q coefficient and the average linkage method.

The three-step unified-clustering method initially obtained 18 overlapping groups of multimorbid conditions (see Supplementary Table 6, available as Supplementary data at IJE online). Figure 2 displays the network of identified groups of multimorbid conditions. Osteoarthritis, Hypertension and Sinusitis are the three conditions with the highest number of multimorbid conditions. Using the conversion procedure, we identified five non-overlapping groups of 19 conditions. Supplementary Table 7, available as Supplementary data at IJE online, shows the pairwise Somers’ D statistics on the degree of non-random multimorbidity among 25 conditions.
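
Two ingredients of this method, the asymmetric Somers' D statistic and false discovery rate control, can be illustrated with a short sketch. For two binary conditions, Somers' D(Y|X) reduces to the difference P(Y = 1 | X = 1) - P(Y = 1 | X = 0). The Benjamini-Hochberg step-up rule shown here is the generic procedure of reference 48. How the three-step method derives its p-values and thresholds is not described in this section, so the cell counts and p-values below are hypothetical.

```python
def somers_d(a, b, c, d):
    """Somers' D(Y|X) for a 2x2 table.

    a = X and Y present, b = X only, c = Y only, d = neither.
    Equals P(Y=1 | X=1) - P(Y=1 | X=0).
    """
    return a / (a + b) - c / (c + d)


def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the hypothesis is rejected
    under the Benjamini-Hochberg step-up rule at FDR level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= q * rank / m:
            cutoff = rank  # largest rank meeting its threshold
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= cutoff:
            rejected[i] = True
    return rejected


# Hypothetical counts: the statistic depends on which condition is conditioned on.
d_y_given_x = somers_d(20, 80, 30, 870)   # Y given X
d_x_given_y = somers_d(20, 30, 80, 870)   # X given Y (table transposed)

# Hypothetical p-values for four condition pairs.
flags = benjamini_hochberg([0.001, 0.02, 0.03, 0.2], q=0.05)
```

The asymmetry (about 0.17 in one direction and 0.32 in the other for these counts) is what distinguishes Somers' D from symmetric measures such as the Jaccard and Yule Q coefficients.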

With MCA, seven dimensions had principal inertias (see Supplementary Table 8, available as Supplementary data at IJE online) greater than the average inertia, 1/25 = 0.04. Among these seven dimensions, the last five contributed negligibly to the explained inertia; hence, the first two dimensions, which accounted for 83.1% of the inertia, were retained. Figure 3 displays the results of the MCA based on a Burt matrix for these two dimensions. We identified four groups of 19 conditions (Table 2). Supplementary Figure 7, available as Supplementary data at IJE online, shows a different grouping result (six groups of 22 conditions) corresponding to the MCA based on the Burt matrix with three dimensions, which explain 84.8% of the inertia (Table 2).
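
The retention rule and the inertia adjustment can be sketched as follows. The sketch assumes the adjustment commonly attributed to Greenacre, in which only dimensions whose indicator-matrix principal inertia exceeds 1/Q are kept (Q being the number of variables, here 25) and each retained inertia is rescaled. Whether this exact formula was used in the analysis is an assumption, and the input inertias are hypothetical.

```python
def adjusted_inertias(indicator_inertias, n_variables):
    """Adjusted principal inertias for MCA.

    Keeps only dimensions with inertia above the average 1/Q and
    rescales them as (Q / (Q - 1))^2 * (inertia - 1/Q)^2.
    """
    q = n_variables
    threshold = 1.0 / q
    scale = (q / (q - 1.0)) ** 2
    return [scale * (lam - threshold) ** 2
            for lam in indicator_inertias
            if lam > threshold]


# Hypothetical principal inertias for 25 binary conditions.
adj = adjusted_inertias([0.08, 0.05, 0.04, 0.03], n_variables=25)
```

Only the first two hypothetical dimensions exceed 1/25 = 0.04, so two adjusted values are returned. The percentage of inertia explained then depends on the chosen denominator, which is not reproduced here.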

Figure 2.

Non-random multimorbidity between 19 conditions using the three-step unified clustering method (nodal size proportional to the number of conditions multimorbid with the condition; bolded lines link the ‘closest’ pairs).

Figure 3.

MCA map for Dimension 2 vs Dimension 1 of health conditions based on Burt matrix with adjusted inertia: 0 = absence of condition, 1 = presence of condition.

The network and cluster analyses, using Gephi software (version 0.9.1) through a ForceAtlas2 layout algorithm with a resolution of 0.6 and edge weight influence of 1.0, identified five non-overlapping groups of 25 conditions (Table 2). Figure 4 presents the network of identified groups of conditions.
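
The modularity score that underlies the grouping step can be written down compactly. The sketch below computes the standard Newman modularity of a given partition of a weighted undirected graph. It does not reproduce Gephi's community-detection search or its resolution parameter, and the small graph (two triangles joined by one edge) is hypothetical.

```python
def modularity(edges, membership):
    """Newman modularity of a partition.

    edges: list of (u, v, weight) for an undirected graph.
    membership: dict mapping each node to its group label.
    """
    total = sum(w for _, _, w in edges)
    strength = {}   # weighted degree of each node
    within = {}     # total edge weight inside each group
    for u, v, w in edges:
        strength[u] = strength.get(u, 0.0) + w
        strength[v] = strength.get(v, 0.0) + w
        if membership[u] == membership[v]:
            g = membership[u]
            within[g] = within.get(g, 0.0) + w
    group_strength = {}
    for node, s in strength.items():
        g = membership[node]
        group_strength[g] = group_strength.get(g, 0.0) + s
    return sum(within.get(g, 0.0) / total - (s / (2.0 * total)) ** 2
               for g, s in group_strength.items())


# Hypothetical graph: two triangles linked by a single edge, unit weights.
edges = [("A", "B", 1), ("B", "C", 1), ("A", "C", 1),
         ("D", "E", 1), ("E", "F", 1), ("D", "F", 1),
         ("C", "D", 1)]
split = {"A": 1, "B": 1, "C": 1, "D": 2, "E": 2, "F": 2}
merged = {n: 1 for n in split}
q_split = modularity(edges, split)
q_merged = modularity(edges, merged)
```

A community-detection algorithm searches for the partition that maximizes this score. Here the two-triangle split scores higher than placing all nodes in one group, which is the behaviour the grouping of conditions relies on.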

Figure 4.

Network analysis of health conditions using ForceAtlas2 layout algorithm: same shaded/coloured nodes (indicated with the group number) depict a multimorbid health condition group obtained via modularity class analysis (nodal size proportional to the prevalence of the health condition).

Discussion

Our systematic review showed that the analytical methods used in different studies to identify groups of multimorbid conditions are heterogeneous, which may explain the variation in the multimorbidity patterns reported in these studies. As illustrated in the comparison analysis using the NHS data, even the same analytical method may yield different results when different criteria are used for the proximity measure or the distance between clusters. The specific analytical methodology adopted for grouping multimorbid conditions, along with the selection/coding of conditions and the target populations, are the key factors explaining the diversity of the multimorbid groups found across different studies.12 Future studies should provide a detailed description of the methodology used when reporting and summarizing patterns of multimorbidity. Comparison of multimorbidity patterns among studies is plausible only if the same analytical method is used for grouping multimorbid health conditions.

The five analytical methods use different approaches to form clusters and adapt to the study of multimorbidity. The measure adopted to quantify co-occurrence between a pair of conditions is particularly crucial in explaining the variation in the identified multimorbid condition groups between these methods. Factor analysis works on a tetrachoric correlation matrix of morbidity data, but the capacity of this measure to adjust for co-occurrence by chance is unknown. An interpretation issue may also arise when negative factor loadings are found for certain latent factors (see Supplementary Tables 4 and 5, available as Supplementary data at IJE online) or when estimates of variances converge to a value close to zero (a Heywood case).106 Hierarchical-clustering methods commonly use Jaccard or Yule Q coefficients for multimorbidity studies. A previous study34 showed that the Jaccard coefficient is inappropriate for measuring multimorbidity because it does not adjust for co-occurrence by chance. The three-step clustering method quantifies the degree of co-occurrence of a pair of conditions using the asymmetric Somers’ D statistic with control for the FDR, which showed high power to detect non-random multimorbidity. The MCA works on a Burt matrix, which is the symmetrical matrix of all two-way cross-tabulations between each pair of conditions (analogous to the covariance matrix for continuous variables). Further investigation is needed into how well this measure adjusts for co-occurrence by chance. The method involving both network and cluster analyses relies on Gephi software and a ForceAtlas2 algorithm to visualize the network of conditions based on a force-directed continuous graph layout, and on a modularity analysis to define clusters of conditions. As with hierarchical-clustering methods, different cut-off values produce clustering results with different numbers of clusters.
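
The point about chance co-occurrence can be made explicit with a short worked derivation. It assumes two conditions A and B that occur independently with prevalences p_A and p_B, and uses the prevalence rates listed in Table 2 purely as illustrative inputs; the independence assumption is hypothetical and is not a finding of the NHS analysis.

```latex
% Under independence, P(A \cap B) = p_A p_B, so the Jaccard coefficient is
J_{\text{indep}} = \frac{P(A \cap B)}{P(A \cup B)}
                 = \frac{p_A p_B}{p_A + p_B - p_A p_B}.

% Hay fever (p_A = 0.164) and hypertension (p_B = 0.110):
J_{\text{indep}} = \frac{0.01804}{0.25596} \approx 0.070.

% Oedema (p_A = 0.017) and dermatitis (p_B = 0.011):
J_{\text{indep}} = \frac{0.000187}{0.027813} \approx 0.0067.

% Yule's Q, with cell counts a, b, c, d of the 2 x 2 table:
Q = \frac{ad - bc}{ad + bc} = 0 \quad \text{whenever } ad = bc \text{ (odds ratio } 1\text{)}.
```

Under independence the Jaccard coefficient is therefore non-zero and about ten times larger for the common pair than for the rare pair, whereas Yule's Q is zero in both cases. This is the sense in which the Jaccard coefficient does not adjust for co-occurrence by chance.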

Applying these five methods to the same NHS morbidity data, we found that the methodological discrepancies between these methods influenced the identification of multimorbidity patterns, leading to different groupings of multimorbid conditions. We extracted the main similarities among these identified groups. The first group consists of a combination of cardiovascular and metabolic diseases, including angina, hypertension, cholesterol, diabetes, oedema and gout. Some methods also relate osteoarthritis to this group, but Figure 2 indicates that osteoarthritis appears to form a stronger group of its own with osteoporosis and thyroid. The second group corresponds to mental health problems alone (anxiety, depression, mood disorder). The hierarchical-clustering method with Yule Q coefficients links migraine with this group, whereas the three-step clustering method relates back pain to the mental health conditions (as back pain and depression form the closest pair) and the network and cluster analyses link disc disorder with the mental health disorder group. Previous studies107,108 showed associations between mental health disorders and pain. Another recent review found a high level of multimorbidity between mental disorders.109 The third group comprises allergic diseases, including hay fever, sinusitis and food allergy. There is also a suggestion that this group of allergic diseases may link to both bronchitis and asthma.

Previous multimorbidity studies often considered an index condition or an isolated combination of multimorbid conditions.17,110 When the prevalence of multimorbid condition combinations is not high [e.g. the most prevalent triple combination in the NHS data is (Hayfever, Sinusitis, Asthma); prevalence of 1%], the sample size for certain combinations of multimorbid conditions may not be sufficiently large to provide reliable and/or generalizable results. New approaches focus on identifying groups of individuals with different patterns of multimorbidity and thus allow impact analyses using the whole sample simultaneously. They include the use of latent class analysis,111,112 hierarchical-clustering methods,113,114 self-organizing maps,115 principal-component analysis116 and the mixture of multivariate generalized Bernoulli distributions.9

Recent reviews on multimorbidity studies12,26 recommended that future epidemiological studies should cover a broad selection of health conditions in order to avoid missing important nosological entities and enhance external validity. When many conditions are considered, clustering of individuals on the basis of morbidity data will face the problem of high dimensionality.117 This is particularly important when a clustering-based approach is adopted to assess the impact of multimorbidity on individuals’ health outcomes and service use. In handling high-dimensional morbidity data, grouping of multimorbid conditions provides a dimension-reduction method for the subsequent clustering of individuals.9 The three-step clustering method34 establishes a clustering-based multimorbidity score for each individual on the basis of a weighted scale in terms of the estimated probabilities of the membership of multimorbid condition groups.9 Grouping of multimorbid conditions thus offers an alternative approach to measuring individual multimorbidity status for impact analyses of health outcomes. Another review118 focuses on multimorbidity in primary care and highlights the need to explore longitudinal data for evaluating which multimorbid conditions are more persistent over time119 and their role in influencing health outcomes and service use at different periods of time.21

The main strength of our review is the use of broad terms for building the search algorithm, which produced a list of 13 194 studies in the first round of the review, including relevant papers that had not been identified in previous reviews on multimorbidity studies. Our review has the following limitations. We included English-only articles. We may have omitted articles that considered multimorbidity with an index condition, and thus the interpretation of the identified multimorbidity patterns should be confined to the nature of co-occurrence of conditions observed in the general population. Our review focused on the analytical methods for identifying groups of multimorbid conditions and, as such, the assessment of the content of the identified patterns of multimorbidity falls outside the scope of the present review. We used a single NHS dataset in the comparison analysis to assess the effects of heterogeneous analytical methods on the identified multimorbidity patterns.

Conclusion

Our review and comparison analysis suggest that the methodology applied to reveal the group structure of multimorbid conditions has great influence on the multimorbidity patterns found across different studies, posing a serious challenge to interpreting or comparing these patterns. We identified and compared five analytical methods for identifying groups of multimorbid conditions. However, further comparison of these methods in a wide variety of settings and populations would help to guide the choice of analytical method for improved validity and generalizability of findings. Investigators should also attempt to compare results obtained by various methods for a consensus grouping of multimorbid conditions. Studies with a population-based longitudinal-study setting offer a cost-effective approach to trace patterns of multimorbidity at different times and identify which patterns have the greatest impact on health outcomes and health-service use over time. The findings will inform future research on intervention studies or trials for long-term achievements of improved care and wellbeing for people with multiple multimorbid conditions.

Funding

The work was supported by a grant from the Menzies Health Institute Queensland, Griffith University, Australia.

Acknowledgements

The authors are grateful to the Australian Bureau of Statistics for providing the Australian National Health Survey data in Confidentialised Unit Record Files (CURFs).

Conflict of interest: The authors have no conflicts of interest to declare.

References

1. van den Akker M, Buntinx F, Knottnerus J. Comorbidity or multimorbidity: what’s in a name? A review of literature. Eur J Gen Pract 1996;2:65–70.

2. Caughey G, Roughead EE. Multimorbidity research challenges: where to go from here? J Comorb 2011;1:8–10.

3. Taylor AW, Price K, Gill TK et al. Multimorbidity - not just an older person’s issue: results from an Australian biomedical study. BMC Public Health 2010;10:718.

4. Westert GP, Satariano WA, Schellevis FG, Van Den Bos GAM. Patterns of comorbidity and the use of health services in the Dutch population. Eur J Public Health 2001;11:365–72.

5. Sciberras E, Westrupp EM, Wake M et al. Healthcare costs associated with language difficulties up to 9 years of age: Australian population-based study. Int J Speech Lang Pathol 2015;17:41–52.

6. Fenn B, Morris SS, Black RE. Comorbidity in childhood in northern Ghana: magnitude, associated factors, and impact on mortality. Int J Epidemiol 2005;34:368–75.

7. McRae I, Yen L, Jeon Y-H, Herath PM, Essue B. Multimorbidity is associated with higher out-of-pocket spending: a study of older Australians with multiple chronic conditions. Aust J Prim Health 2013;19:144–49.

8. Glynn LG, Valderas JM, Healy P et al. The prevalence of multimorbidity in primary care and its effect on health care utilization and cost. Fam Pract 2011;28:516–23.

9. Ng SK. A two-way clustering framework to identify disparities in multimorbidity patterns of mental and physical health conditions among Australians. Stat Med 2015;34:3444–60.

10. Clark NM, Lachance L, Benedict MB et al. The extent and patterns of multiple chronic conditions in low-income children. Clin Pediatr (Phila) 2015;54:353–58.

11. Kuo RN, Lai MS. The influence of socio-economic status and multimorbidity patterns on healthcare costs: a six-year follow-up under a universal healthcare system. Int J Equity Health 2013;12:69.

12. Prados-Torres A, Calderón-Larrañaga A, Hancco-Saavedra J, Poblador-Plou B, Van Den Akker M. Multimorbidity patterns: a systematic review. J Clin Epidemiol 2014;67:254–66.

13. Moffat K, Mercer SW. Challenges of managing people with multimorbidity in today’s healthcare systems. BMC Fam Pract 2015;16:129.

14. Fortin M, Hudon C, Bayliss EA, van den Akker M. Multimorbidity’s many challenges. Time to focus on the needs of this vulnerable and growing population. Br Med J 2007;334:1016–17.

15. Aspin C, Jowsey T, Glasgow N et al. Health policy responses to rising rates of multi-morbid chronic illness in Australia and New Zealand. Aust N Z J Public Health 2010;34:386–93.

16. Jowsey T, Jeon YH, Dugdale P, Glasgow NJ, Kljakovic M, Usherwood T. Challenges for co-morbid chronic illness care and policy in Australia: a qualitative study. Aust New Zealand Health Policy 2009;6:22.

17. Larson K, Russ SA, Kahn RS, Halfon N. Patterns of comorbidity, functioning, and service use for US children with ADHD, 2007. Pediatrics 2011;127:462–70.

18. Valderas JM, Starfield B, Sibbald B, Salisbury C, Roland M. Defining comorbidity: implications for understanding health and health services. Ann Fam Med 2009;7:357–63.

19. Mercer SW, Guthrie B, Furler J, Watt GCM, Hart JT. Multimorbidity and the inverse care law in primary care. BMJ 2012;344:e4152.

20. Marengoni A, Rizzuto D, Wang HX, Winblad B, Fratiglioni L. Patterns of chronic multimorbidity in the elderly population. J Am Geriatr Soc 2009;57:225–30.

21. Jindai K, Nielson CM, Vorderstrasse BA, Quinones AR. Multimorbidity and functional limitations among adults 65 or over, NHANES 2005–2012. Prev Chronic Dis 2016;13:160–74.

22. Smith SM, Wallace E, O’Dowd T, Fortin M. Interventions for improving outcomes in patients with multimorbidity in primary care and community settings. Cochrane Database Syst Rev 2016;3:CD006560.

23. Violán C, Foguet-Boreu Q, Hermosilla-Pérez E et al. Comparison of the information provided by electronic health records data and a population health survey to estimate prevalence of selected health conditions and multimorbidity. BMC Public Health 2013;13:251.

24. Huntley AL, Johnson R, Purdy S, Valderas JM, Salisbury C. Measures of multimorbidity and morbidity burden for use in primary care and community settings: a systematic review and guide. Ann Fam Med 2012;10:134–41.

25. John R, Kerby DS, Hennessy CH. Patterns and impact of comorbidity and multimorbidity among community-resident American Indian elders. Gerontologist 2003;43:649–60.

26. Fortin M, Stewart M, Poitras ME, Almirall J, Maddocks H. A systematic review of prevalence studies on multimorbidity: toward a more uniform methodology. Ann Fam Med 2012;10:142–51.

27. Schneeweiss S, Maclure M. Use of comorbidity scores for control of confounding in studies using administrative databases. Int J Epidemiol 2000;29:891–98.

28. Holden L, Scuffham PA, Hilton MF, Muspratt A, Ng SK, Whiteford HA. Patterns of multimorbidity in working Australians. Popul Health Metrics 2011;9:15.

29. de Groot V, Beckerman H, Lankhorst GJ, Bouter LM. How to measure comorbidity: a critical review of available methods. J Clin Epidemiol 2003;56:221–29.

30. Kadam UT, Uttley J, Jones PW, Iqbal Z. Chronic disease multimorbidity transitions across healthcare interfaces and associated costs: a clinical-linkage database study. BMJ Open 2013;3:e003109.

31. Brettschneider C, Leicht H, Bickel H et al. Relative impact of multimorbid chronic conditions on health-related quality of life - results from the MultiCare Cohort Study. PLoS One 2013;8:e66742.

32. Chamberlain AM, Sauver JLS, Gerber Y et al. Multimorbidity in heart failure: a community perspective. Am J Med 2015;128:38–45.

33. Seah JZ, Harris A, Lorgelly PK. Hospital resource use in chronic disease combinations: is it enough to just add them up? Value Health 2013;16:A466.

34. Ng SK, Holden L, Sun J. Identifying comorbidity patterns of health conditions via cluster analysis of pairwise concordance statistics. Stat Med 2012;31:3393–405.

35. Australian Bureau of Statistics (ABS). National Health Survey 2007–08: User’s Guide, Cat. No. 4363.0.55.001. Canberra: Australian Bureau of Statistics, 2009.

36. World Health Organization (WHO). International Classification of Diseases and Related Health Problems, 10th Revision (ICD-10), 2007.

37. Australian Bureau of Statistics (ABS). National Health Survey 2007–08: Summary of Results (Reissue), Cat. No. 4364.0. Canberra: Australian Bureau of Statistics, 2009.

38. Australian Bureau of Statistics (ABS). National Health Survey 2007–08, Basic and Expanded Confidentialised Unit Record Files (Reissue), Cat. No. 4324.0. Canberra: Australian Bureau of Statistics, 2010.

39. Kubinger KD. On artificial results due to using factor analysis for dichotomous variables. Psychol Sci 2003;45:106–10.

40. Johnson R, Wichern DW. Applied Multivariate Statistical Analysis. Saddle River, NJ: Prentice Hall, 2002.

41. Rencher AC. Methods of Multivariate Analysis. New York: John Wiley & Sons, 2003.

42. Comrey AL. The minimum residual method of factor analysis. Psychol Rep 1962;11:15–18.

43

Patil
VH
,
Singh
S
,
Mishra
S
,
Donavan
DT
.
Efficient theory development and factor retention criteria: abandon the ‘eigenvalue greater than one’ criterion
.
J Bus Res
2008
;
61
:
162
70
.

44

Ruscio
J
,
Roche
B
.
Determining the number of factors to retain in an exploratory factor analysis using comparison data of known factorial structure
.
Psychol Assess
2012
;
24
:
282
92
.

45

Treiblmaier
H
,
Filzmoser
P
.
Exploratory factor analysis revisited: how robust methods support the detection of hidden multivariate data structures in IS research
.
Inf Manag
2010
;
47
:
197
207
.

46

Hartigan
JA
.
Clustering Algorithms
.
New York, NY
:
Wiley
,
1975
.

47

Lance
GN
,
Williams
WT
.
A generalized theory of classificatory sorting strategies: I. Hierarchical systems
.
Comput J
1967
;
9
:
373
80
.

48

Benjamini
Y
,
Hochberg
Y
.
Controlling the false discovery rate: a practical and powerful approach to multiple testing
.
J R Stat Soc B
1995
;
57
:
289
300
.

49

Borgatti
S
,
Everett
M
,
Freeman
L
.
Ucinet for Windows: Software for Social Network Analysis
.
Harvard, MA
:
Analytic Technologies
. Available at: http://www.analytictech.com.

50

Greenacre
M
,
Blasius
J
.
Multiple Correspondence Analysis and Related Methods
.
Boca Raton
:
CRC Press
,
2006
.

51

Sourial
N
,
Wolfson
C
,
Zhu
B
et al. .
Correspondence analysis is a useful tool to uncover the relationships among categorical variables
.
J Clin Epidemiol
2010
;
63
:
638
46
.

52

Greenacre
M
.
Correspondence Analysis in Practice
.
New York
:
CRC Press
,
2007
.

53

Nenadić
O
,
Greenacre
M
.
Correspondence analysis in R, with two- and three-dimensional graphics: the ca package
.
J Stat Softw
2007
;
20
:
163
70
.

54

Held
FP
,
Blyth
F
,
Gnjidic
D
et al. .
Association rules analysis of comorbidity and multimorbidity: the concord health and aging in men project
.
J Gerontol A Biol Sci Med Sci
2016
;
71
:
625
31
.

55

Jacomy
M
,
Venturini
T
,
Heymann
S
,
Bastian
M
.
ForceAtlas2, a continuous graph layout algorithm for handy network visualization designed for the Gephi software
.
PLoS One
2014
;
9
:
e98679
.

56

Blondel
V
,
Guillaume
J-L
,
Lambiotte
R
,
Lefebvre
E
.
Fast unfolding of communities in large networks
.
J Stat Mech
2008
;
2008
:
P10008
.

57

Lambiotte
R
,
Delvenne
JC
,
Barahona
M
.
Laplacian dynamics and multiscale modular structure in networks
.
Preprint arXiv: 0812.1770v3
,
2009
.

58

Abad-Díez
JM
,
Calderón-Larrañaga
A
,
Poncel-Falcó
A
et al. .
Age and gender differences in the prevalence and patterns of multimorbidity in the older population
.
BMC Geriatr
2014
;
14
:
75
.

59

Alonso-Moran
E
,
Orueta
JF
,
Esteban
JI
et al. .
Multimorbidity in people with type 2 diabetes in the Basque Country (Spain): prevalence, comorbidity clusters and comparison with other chronic patients
.
Eur J Intern Med
2015
;
26
:
197
202
.

60

Clerencia-Sierra
M
,
Calderón-Larrañaga
A
,
Martínez-Velilla
N
et al. .
Multimorbidity patterns in hospitalized older patients: associations among chronic diseases and geriatric syndromes
.
PLoS One
2015
;
10
:
e0132909
.

61

Cornell
JE
,
Pugh
JA
,
Williams
JW
et al. .
Multimorbidity clusters: clustering binary data from multimorbidity clusters: clustering binary data from a large administrative medical database
.
AMR
2007
;
12
:
163
82
.

62

Dong
HJ
,
Wressle
E
,
Marcusson
J
.
Multimorbidity patterns of and use of health services by Swedish 85-year-olds: an exploratory study
.
BMC Geriatr
2013
;
13
:
120
.

63

Dorenkamp
S
,
Mesters
I
,
Schepers
J
et al. .
Disease combinations associated with physical activity identified: the SMILE cohort study
.
Biomed Res Int
2016
;
2016
:
1
.

64

Foguet-Boreu
Q
,
Violán
C
,
Rodriguez-Blanco
T
et al. .
Multimorbidity patterns in elderly primary health care patients in a South Mediterranean European region: a cluster analysis
.
PLoS One
2015
;
10
:
e0141155
.

65

Formiga
F
,
Ferrer
A
,
Sanz
H
,
Marengoni
A
,
Alburquerque
J
,
Pujol
R
.
Patterns of comorbidity and multimorbidity in the oldest old: the Octabaix study
.
Eur J Intern Med
2013
;
24
:
40
44
.

66

Gabilondo
A
,
Alonso-Moran
E
,
Nuño-Solinis
R
,
Orueta
JF
,
Iruin
A
.
Comorbidities with chronic physical conditions and gender profiles of illness in schizophrenia: results from PREST, a new health dataset
.
J Psychosom Res
2017
;
93
:
102
109
.

67

García-Olmos
L
,
Salvador
CH
,
Alberquilla
A
et al. .
Comorbidity patterns in patients with chronic diseases in general practice
.
PLoS One
2012
;
7
:
e32141
.

68

Garin
N
,
Olaya
B
,
Perales
J
et al. .
Multimorbidity patterns in a national representative sample of the Spanish adult population
.
PLoS One
2014
;
9
:
e84794
.

69

Garin
N
,
Koyanagi
A
,
Chatterji
S
et al. .
Global multimorbidity patterns: a cross-sectional, population-based, multi-country study
.
J Gerontol A Biol Sci Med Sci
2016
;
71
:
205
14
.

70

Goldstein
G
,
Luther
JF
,
Haas
GL
,
Appelt
CJ
,
Gordon
AJ
.
Factor structure and risk factors for the health status of homeless veterans
.
Psychiatr Q
2010
;
81
:
311
23
.

71

Gu
J
,
Chao
J
,
Chen
W
et al. .
Multimorbidity in the community-dwelling elderly in urban China
.
Arch Gerontol Geriatr
2017
;
68
:
62
67
.

72

Hermans
H
,
Evenhuis
HM
.
Multimorbidity in older adults with intellectual disabilities
.
Res Dev Disabil
2014
;
35
:
776
83
.

73

Herr
JK
,
Salyer
J
,
Flattery
M
et al. .
Heart failure symptom clusters and functional status—a cross-sectional study
.
J Adv Nurs
2015
;
71
:
1274
87
.

74

Islam
MM
,
Valderas
JM
,
Yen
L
,
Dawda
P
,
Jowsey
T
,
McRae
IS
.
Multimorbidity and comorbidity of chronic diseases among the senior Australians: prevalence and patterns
.
PLoS One
2014
;
9
:
e83783
.

75

Jackson
CA
,
Jones
M
,
Tooth
L
,
Mishra
GD
,
Byles
J
,
Dobson
A
.
Multimorbidity patterns are differentially associated with functional ability and decline in a longitudinal cohort of older women
.
Age Ageing
2015
;
44
:
810
16
.

76

Jackson
CA
,
Dobson
AJ
,
Tooth
LR
,
Mishra
GD
.
Lifestyle and socioeconomic determinants of multimorbidity patterns among mid-aged women: a longitudinal study
.
PLoS One
2016
;
11
:
e0156804
.

77

Jovic
D
,
Vukovic
D
,
Marinkovic
J
.
Prevalence and patterns of multi-morbidity in Serbian adults: a cross-sectional study
.
PLoS One
2016
;
11
:
e0148646
.

78

Kim
DJ
,
Westfall
AO
,
Chamot
E
et al. .
Multimorbidity patterns in HIV-infected patients: the role of obesity in chronic disease clustering
.
J Acquir Immune Defic Syndr
2012
;
61
:
600
05
.

79

Kirchberger
I
,
Meisinger
C
,
Heier
M
et al. .
Patterns of multimorbidity in the aged population. results from the KORA-Age study
.
PLoS One
2012
;
7
:
e30556
.

80

Kumar
RG
,
Juengst
SB
,
Wang
Z
et al. .
Epidemiology of comorbid conditions among adults 50 years and older with traumatic brain injury
.
J Head Trauma Rehabil
2018
;
33
:
15
24
.

81

Magnan
EM
,
Bolt
DM
,
Greenlee
RT
,
Fink
J
,
Smith
MA
.
Stratifying patients with diabetes into clinically relevant groups by combination of chronic conditions to identify gaps in quality of care
.
Health Serv Res
2018
;
53
:
450
68
.

82

Marengoni
A
,
Bonometti
F
,
Nobili
A
et al. .
In-hospital death and adverse clinical events in elderly patients according to disease clustering: the REPOSI study
.
Rejuvenation Res
2010
;
13
:
469
77
.

83

Marengoni
A
,
Nobili
A
,
Pirali
C
et al. .
Comparison of disease clusters in two elderly populations hospitalized in 2008 and 2010
.
Gerontology
2013
;
59
:
307
15
.

84

Nurnberg
HG
,
Raskin
M
,
Levine
PE
,
Pollack
S
,
Siegel
O
,
Prince
R
.
The comorbidity of borderline personality disorder and other DSM-III-R axis II personality disorders
.
Am J Psychiatry
1991
;
148
:
1371
77
.

85

Poblador-Plou
B
,
Van Den Akker
M
,
Vos
R
,
Calderón-Larrañaga
A
,
Metsemakers
J
,
Prados-Torres
A
.
Similar multimorbidity patterns in primary care patients from two European regions: results of a factor analysis
.
PLoS One
2014
;
9
:
e100375
.

86

Prados-Torres
A
,
Poblador-Plou
B
,
Calderón-Larrañaga
A
et al. .
Multimorbidity patterns in primary care: interactions among chronic diseases using factor analysis
.
PLoS One
2012
;
7
:
e32190
.

87

Prazeres
F
,
Santiago
L
.
Prevalence of multimorbidity in the adult population attending primary care in Portugal: a cross-sectional study
.
BMJ Open
2015
;
5
:
e009287
.

88

Ruiz
M
,
Bottle
A
,
Long
S
,
Aylin
P
.
Multi-morbidity in hospitalised older patients: who are the complex elderly?
PLoS One
2015
;
10
:
e0145372
.

89

Schäfer
I
,
von Leitner
EC
,
Schön
G
et al. .
Multimorbidity patterns in the elderly: a new approach of disease clustering identifies complex interrelations between chronic conditions
.
PLoS One
2010
;
5
:
e15941
.

90

Sideris
C
,
Pourhomayoun
M
,
Kalantarian
H
,
Sarrafzadeh
M
.
A flexible data-driven comorbidity feature extraction framework
.
Comput Biol Med
2016
;
73
:
165
72
.

91

Vu
T
,
Finch
CF
,
Day
L
.
Patterns of comorbidity in community-dwelling older people hospitalised for fall-related injury: a cluster analysis
.
BMC Geriatr
2011
;
11
:
45
.

92

Walker
V
,
Perret-Guillaume
C
,
Kesse-Guyot
E
et al. .
Effect of multimorbidity on health-related quality of life in adults aged 55 years or older: results from the SU.VI.MAX 2 cohort
.
PLoS One
2016
;
11
:
e0169282
.

93

Wang
R
,
Yan
Z
,
Liang
Y
et al. .
Prevalence and patterns of chronic disease pairs and multimorbidity among older Chinese adults living in a rural area
.
PLoS One
2015
;
10
:
e0138521
.

94

Kamphaus
RW
.
Clinical Assessment of Child and Adolescent Intelligence
.
New York
:
Springer Science & Business Media
,
2005
.

95

Byrne
B
.
Structural Equation Modeling with AMOS: Basic Concepts, Applications and Programming
.
New York, NY
:
Routledge
,
2009
.

96

Blanco
C
,
Rubio
JM
,
Wall
M
,
Secades-Villa
R
,
Beesdo-Baum
K
,
Wang
S
.
The latent structure and comorbidity patterns of generalized anxiety disorder and major depressive disorder: a national study
.
Depress Anxiety
2014
;
31
:
214
22
.

97

Chaowattanapanit
S
,
Choonhakarn
C
,
Chetchotisakd
P
,
Sawanyawisuth
K
,
Julanon
N
.
Clinical features and outcomes of Sweet’s syndrome associated with non-tuberculous mycobacterial infection and other associated diseases
.
J Dermatol
2016
;
43
:
532
36
.

98

Chubachi
S
,
Sato
M
,
Kameyama
N
et al. .
Identification of five clusters of comorbidities in a longitudinal Japanese chronic obstructive pulmonary disease cohort
.
Respir Med
2016
;
117
:
272
79
.

99

Lochner
C
,
Hemmings
SM
,
Kinnear
CJ
et al. .
Cluster analysis of obsessive-compulsive spectrum disorders in patients with obsessive-compulsive disorder: clinical and genetic correlates
.
Compr Psychiatry
2005
;
46
:
14
19
.

100

Moser
DK
,
Lee
KS
,
Wu
JR
et al. .
Identification of symptom clusters among patients with heart failure: an international observational study
.
Int J Nurs Stud
2014
;
51
:
1366
72
.

101

Newcomer
SR
,
Steiner
JF
,
Bayliss
EA
.
Identifying subgroups of complex patients with cluster analysis
.
Am J Manag Care
2011
;
17
:
e324
.

102

Reynolds
WS
,
Zhang
X
,
Dmochowski
R
,
Bruehl
S
.
Co-morbidity with chronic pain conditions in women with OAB is associated with greater urinary symptom burden
.
Neurourol Urodyn
2016
;
35
:
S40
.

103

Goldstein
G
,
Luther
JE
,
Jacoby
AM
,
Haas
GL
,
Gordon
AJ
.
A taxonomy of medical comorbidity for veterans who are homeless
.
J Health Care Poor Underserved
2008
;
19
:
991
1005
.

104

Gomez-Rubio
P
,
Scarpa
A
,
Sharp
L
et al. .
Patterns of comorbidity and multimorbidity in pancreatic cancer patients
.
Pancreatology
2015
;
15
:
S125
.

105

Ubalde-Lopez
M
,
Gimeno
D
,
Delclos
G
,
Calvo-Bonacho
E
,
Benavides
FG
.
A holistic approach to calculating a multimorbidity score: the usefulness of multi-correspondence analysis
.
Occup Environ Med
2014
;
71
:
A1
.

106

Christoffersson
A
.
Factor analysis of dichotomized variables
.
Psychometrika
1975
;
40
:
5
32
.

107

Clauw
DJ
.
Fibromyalgia: an overview
.
Am J Med
2009
;
122
:
S3
13
.

108

Goldenberg
D
.
The interface of pain and mood disturbances in the rheumatic diseases
.
Semin Arthritis Rheum
2010
;
40
:
15
31
.

109

Steel
Z
,
Marnane
C
,
Iranpour
C
et al. .
The global prevalence of common mental disorders: a systematic review and meta-analysis 1980–2013
.
Int J Epidemiol
2014
;
43
:
476
93
.

110

Pillay
M
,
Dennis
S
,
Harris
MF
.
Quality of care measures in multimorbidity
.
Aust Fam Physician
2014
;
43
:
132
36
.

111

D’Agostino
NM
,
Edelstein
K
,
Zhang
N
et al. .
Comorbid symptoms of emotional distress in adult survivors of childhood cancer
.
Cancer
2016
;
122
:
3215
24
.

112

Whitson
HE
,
Johnson
KS
,
Sloane
R
et al. .
Identifying patterns of multimorbidity in older Americans: application of latent class analysis
.
J Am Geriatr Soc
2016
;
64
:
1668
73
.

113

Collerton
J
,
Jagger
C
,
Yadegarfar
ME
et al. .
Deconstructing complex multimorbidity in the very old: findings from the Newcastle 85+ study
.
Biomed Res Int
2016
;
2016
:
1
.

114

Lacedonia
D
,
Carpagnano
GE
,
Sabato
R
et al. .
Characterization of obstructive sleep apnea-hypopnea syndrome (OSA) population by means of cluster analysis
.
J Sleep Res
2016
;
25
:
724
30
.

115

Vanfleteren
LEGW
,
Spruit
MA
,
Groenen
M
et al. .
Clusters of comorbidities based on validated objective measurements and systemic inflammation in patients with chronic obstructive pulmonary disease
.
Am J Respir Crit Care Med
2013
;
187
:
728
35
.

116

Vavougios
GD
,
Natsios
G
,
Pastaka
C
,
Zarogiannis
SG
,
Gourgoulianis
KI
.
Phenotypes of comorbidity in OSAS patients: combining categorical principal component analysis with cluster analysis
.
J Sleep Res
2016
;
25
:
31
38
.

117

McLachlan
G
,
Ng
S
,
Wang
K
.
Clustering of high-dimensional and correlated data
. In:
Lauro
C
,
Palumbo
F
,
Greenacre
M
(eds).
Studies in Classification, Data Analysis, and Knowledge Organization: Data Analysis and Classification
.
Berlin
:
Springer-Verlag
,
2010
.

118

France
EF
,
Wyke
S
,
Gunn
JM
,
Mair
FS
,
McLean
G
,
Mercer
SW
.
Multimorbidity in primary care: a systematic review of prospective cohort studies
.
Br J Gen Pract
2012
;
62
:
e297
307
.

119

Wake
M
,
Hardy
P
,
Sawyer
MG
,
Carlin
JB
.
Comorbidities of overweight/obesity in Australian pre-schoolers: a cross-sectional population study
.
Arch Dis Child
2008
;
93
:
502
507
.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)

Supplementary data