Abstract

Background: Demographic change influences not only the terms of health care, but also its financing. Hence, prevention is becoming a more important key to facing upcoming challenges. The aim of this study was to identify predictors for future high-cost patients and derive implications for potential starting points for prevention. Methods: Claims data from a German statutory health insurance agency were used. High-cost patients were defined as the 10% most expensive persons to insure in 2011. The predictors stemmed from the previous year. Logistic regression with stepwise forward selection for 10 sex- and age-specific subgroups was performed. Model fit was assessed by Nagelkerke’s R-squared value. Results: Model fit values indicated well-suited models that yielded better results among younger age-groups. Identified predictors can be summarized as different sets of variables that mostly pertain to diseases. Some are rather broad and include different disorders, like the set of mental/behavioural disorders including depression and schizophrenia; other sets of variables are more homogeneous, such as metabolic diseases, with diabetes mellitus (DM) being the dominant member in every subgroup. Conclusion: Because diabetes was a significant predictor for future high-cost patients in all analysed subgroups, it should be considered as a potential starting point for prevention. The disease is specific enough to allow for the implementation of effective prevention strategies, and it is possible to intervene, even in patients already affected by DM. Furthermore, the monetary savings potential is probably high because the long-term complications of DM are expensive to treat and affect a large part of the population.

Introduction

Most industrialized countries have to deal with the outcomes of demographic changes that affect many social sectors. 1 The health care sector, for example, has to deal with increasing life expectancy, which changes the dynamics of dominant diseases and leads to more people surviving longer with chronic diseases and suffering from multi-morbidity. 2 These developments not only affect future demands on health care, but also its long-term financing. Health care costs are increasing due to the growing size of the older population, whilst at the same time fewer people of working age are available to fund these changing needs because of decreasing birth rates. To address future financial challenges properly, it is necessary to adapt the health care system in terms of its organization, emphasis and collaboration among relevant actors, such as physicians, hospitals, insurance companies and policy makers. 3

Like many other countries, Germany’s health care system still focuses on acute care while over-emphasizing the inpatient sector. 4 However, more and more stakeholders realize that their view has to expand to include the whole range of health care processes, from prevention to rehabilitation. In particular, the prevention of diseases with high financial impact seems to be a highly promising approach. In light of increasing life expectancies, it should be a main objective to at least delay the onset of diseases or to avoid their advanced stages, if not to prevent them completely. In order to know where best to start prevention interventions to achieve the greatest effects in the future, one must first examine the distribution of health care costs.

Nearly 87% of the German population is insured in the statutory health insurance (SHI) system, which covers the bulk of their health care expenses. 5 An analysis of the cost distribution in the SHI system revealed that 10% of the insured account for 80% of the health care expenditures. 6 With older people dominating this group, it is likely that demographic changes will further increase expenses. A similar pattern of cost distribution can be found in many other industrialized countries. 7 Thus, this small group of high-cost patients would be a reasonable target for starting prevention measures. Studies describe high-cost patients and frequent attenders (who consequently accumulate high expenses) as less likely to be married 8 and having a lesser locus of control. 9 They tend to suffer more often from mental diseases like depression 10 and somatisation, 11 and are more likely to have diabetes, cardiac diseases and asthma or chronic obstructive pulmonary disease (COPD). 7,12 Chronic diseases and multi-morbidity are common among those in these groups, 9,13 and their self-reported health status is worse than that of others. 10,13 These findings imply that high-cost patients differ from others and that they might have specific needs that could be addressed with prevention programs. In contrast to previous cross-sectional studies, this study aimed to identify predictors for high-cost patients emerging the year prior to the high-cost episode in order to draw conclusions regarding effective starting points for prevention.

Methods

Data source

The present analysis was performed using data from the years 2010 and 2011 provided by AOK Lower Saxony, Germany’s 10th-largest statutory health insurer, with currently 2.4 million insured members. The SHI is a public insurance system. Based on pay-as-you-go financing, all insured persons are entitled to the same services and treatments. The SHI data contain basic information on the insured persons like sex, age and insurance status as well as all information on costs, diagnoses and treatments (for the billing system) for the following areas: in- and outpatient care, sickness benefits and rehabilitation, home nursing, ambulatory drug supply and prescribed therapeutic appliances and remedies. Therefore, the data provide a comprehensive overview of the health care claims of the insured persons.

Definition of high-cost patients

High-cost patients were defined as the top 10% most expensive insured persons in 2011 based on the overall costs. The high-cost group was compared to a randomized sample of 10% of the remaining population. Each group consisted of ∼200 000 insured persons. Relative definitions with a clearly defined cut-off point have the advantage of comparability, whereas absolute and metric approaches are less suitable for comparison. 14,15
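The relative definition above can be illustrated with a minimal Python sketch. It is not the authors' SPSS procedure: the function name `split_high_cost`, the dictionary input and the fixed random seed are assumptions made for illustration, and ties at the cut-off are resolved arbitrarily by sort order.

```python
import random


def split_high_cost(costs, top_share=0.10, control_share=0.10, seed=0):
    """Split insured persons into a high-cost group and a random control sample.

    costs: dict mapping a (hypothetical) person id to overall annual costs.
    Returns (high_cost_ids, control_ids).
    """
    # Rank persons from most to least expensive.
    ranked = sorted(costs, key=costs.get, reverse=True)

    # The top share (10% by default) forms the high-cost group.
    n_top = round(len(ranked) * top_share)
    high_cost = ranked[:n_top]
    remaining = ranked[n_top:]

    # Draw a random sample (10% by default) of the remaining population.
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    n_control = round(len(remaining) * control_share)
    control = rng.sample(remaining, n_control)
    return high_cost, control


if __name__ == "__main__":
    example_costs = {person: float(person) for person in range(100)}
    high, control = split_high_cost(example_costs)
    print(len(high), len(control))
```

Because the cut-off is a rank rather than an amount in euros, the same rule can be applied to other years or insurers without adjusting for price levels, which is the comparability advantage noted above.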

Formation of subgroups

Previous studies suggested that performing separate analyses for sex- and age-specific subgroups yields more suitable results. 15,16 Because there is no consensus on how to arrange these groups, age limits were defined based on the present data. Therefore, suitable variables that distinguish between ages were detected first. Minors below the age of 18 were excluded.

Identification of potential predictors

All predictors from a literature search were considered as potential predictors for the analysis whenever operationalization of the insurance data was possible. In addition, exploratory analysis was performed to find more predictors in order to take the specific nature of the data source into account. Predictors were selected for further analysis based on their overall frequencies in the high-cost patient group as well as by comparison with the control group. Thus, diseases that were very expensive but rather rare (e.g. haemophilia) were omitted.

Diagnoses were defined using the German version of the International Classification of Diseases (ICD-10-GM), 17 and drugs were named using the Anatomical Therapeutic Chemical Classification system (ATC). 18 Diagnoses and drugs were characterized as single-digit ICD or ATC codes representing larger classes of diseases, or as three-digit codes corresponding to more specific entities. ICD and ATC codes were analysed either as metric or dichotomous variables. Variables that count the occurrence of different ICD codes and multiple occurrence of the same code during the investigated year were also included to identify multi-morbidity and chronic conditions. Metric overall costs and dichotomous information on whether or not costs in specific health care sectors occurred, as well as the need for home nursing and different insurance statuses were also analysed.

Due to the high number of potential predictors, bivariate correlation analyses were performed with each potential predictor and the outcome variable. Significant predictors with phi-coefficients >0.1 were further considered.
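As an illustration of this screening step, the following Python sketch computes the phi coefficient for two dichotomous variables and keeps predictors above the 0.1 threshold. The function names and the `absolute` switch are assumptions; the text does not state whether the absolute value of phi was used, and the significance test is omitted here.

```python
import math


def phi_coefficient(x, y):
    """Phi coefficient of two equally long sequences of 0/1 values."""
    n11 = sum(1 for a, b in zip(x, y) if a and b)
    n10 = sum(1 for a, b in zip(x, y) if a and not b)
    n01 = sum(1 for a, b in zip(x, y) if not a and b)
    n00 = sum(1 for a, b in zip(x, y) if not a and not b)
    denom = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        return 0.0  # one variable is constant, so no association is measurable
    return (n11 * n00 - n10 * n01) / denom


def screen_predictors(predictors, outcome, threshold=0.1, absolute=False):
    """Return the names of predictors whose phi with the outcome exceeds the threshold.

    predictors: dict mapping a (hypothetical) predictor name to 0/1 values.
    absolute: assumption switch; if True, negative associations also pass.
    """
    kept = []
    for name, values in predictors.items():
        phi = phi_coefficient(values, outcome)
        if absolute:
            phi = abs(phi)
        if phi > threshold:
            kept.append(name)
    return kept
```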

Regression models

Because of the large number of potential predictors, partial models were used to detect the most important and influential variables for each subgroup. Separate logistic regressions with stepwise forward selection were thus performed for predictors identified in the literature and in our exploratory analysis. The selected predictors of the partial models were then entered in the overall model and one final regression model was performed for each subgroup. The dependent variable of each regression was the dichotomous information on whether an insuree was a high-cost patient in 2011. The explanatory variables stemmed from the previous year (2010).

All models were built by using logistic regression with stepwise forward selection. Significance levels were set at P = 0.05, and Nagelkerke’s R-squared was used for analysis of model fit. In accordance with Backhaus et al., 19 values higher than 20% are acceptable, those higher than 40% are good and values higher than 50% indicate very good model fit. Data management and statistical analyses were performed using IBM SPSS Statistics 20.
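For readers unfamiliar with the fit measure, the sketch below shows how Nagelkerke’s R-squared is commonly computed from the log-likelihoods of the null and fitted models, together with the interpretation thresholds quoted above. It is a minimal Python illustration, not the SPSS routine used in the study; the function names are assumptions.

```python
import math


def nagelkerke_r2(ll_null, ll_model, n):
    """Nagelkerke's R-squared from two log-likelihoods and the sample size n.

    ll_null:  log-likelihood of the intercept-only model
    ll_model: log-likelihood of the fitted model
    """
    # Cox and Snell R-squared, whose maximum is below 1 for binary outcomes.
    cox_snell = 1.0 - math.exp((2.0 / n) * (ll_null - ll_model))
    # Largest value Cox and Snell can reach for this sample.
    max_cox_snell = 1.0 - math.exp((2.0 / n) * ll_null)
    # Nagelkerke rescales so that a perfect model reaches 1.
    return cox_snell / max_cox_snell


def fit_label(r2):
    """Map an R-squared value to the thresholds cited in the text."""
    if r2 > 0.5:
        return "very good"
    if r2 > 0.4:
        return "good"
    if r2 > 0.2:
        return "acceptable"
    return "below acceptable"
```

With these thresholds, a value of 0.211 would be labelled acceptable and a value of 0.460 good.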

Results

Formation of subgroups

Sex- and age-specific subgroups were defined using the aforementioned data. The distribution of costs in different health care sectors and the distribution of the main diagnoses in the inpatient sector were chosen as suitable variables to distinguish between ages. The age limits were set so as to obtain homogeneous subgroups with similar limits for men and women and approximately equal age spans. Five male and female age groups were defined: 18–34 years, 35–49 years, 50–64 years, 65–79 years and 80 years and older. 20 The subgroups differed in size and ratio of high-cost patients to control group as the high-cost patients did not follow the age distribution of the remaining SHI-population (figure 1).

Figure 1

Age distribution of high-cost patients and the remaining population by sex

Identification of potential predictors

Not all potential predictors from the literature could be operationalized with our data. No locus of control or self-reported health data were available, and information on marital status was not valid in every case. Other predictors had to be translated into ICD codes, which are far more specific than self-reported information on diagnoses.

The majority of the predictors from the exploratory analysis refer to information on diagnoses or prescription drugs. However, multi-morbidity, chronic conditions, cost information, the need for home nursing and insurance status were also identified as potential predictors.

Regression models

Logistic regression models were performed for 10 subgroups. The model fit values of the 10 final regression models ranged from 21.1% (acceptable) to 46.0% (good), indicating that all of the models were well-suited to predict high-cost patients. The highest values for both sexes were found in the 35–49 year age group (46.0% for men and 45.3% for women). Although the R-squared value for the youngest group was almost the same among men (45.3%), the explained variation in young women was comparatively low (33.1%). In the older age groups, the model fit values decreased and were lowest in the oldest age groups for both sexes.

Tables 1 and 2 show the results of the logistic regression models for men and women by age group. Although there are many different variables in the tables, the predictors can be structured in different sets of content-related variables.

Table 1

Results of stepwise forward regression for men of all age groups

Age group
Predictor (2010)    18–34 Odds ratio    35–49 Odds ratio    50–64 Odds ratio    65–79 Odds ratio    80+ Odds ratio
Insurance status
    Voluntarily insured a 1.427
    With dependant coverage a 0.810
    Unemployed a 0.677
    Retired a 1.509 1.251
    Medical or employment measurement a 2.030 1.955 2.215
    Application a 0.775 0.791
Costs
    Overall (transformed) b 1.273 1.215 1.108 1.084 1.079
    Travelling a 1.373 1.439 1.285 1.225 1.386
    Remedies a 1.218 1.133
    Therapeutic appliances a 1.233 1.301 1.220 1.277 1.159
    Inpatient a 1.261 1.336
    Sickness benefit a 0.490
Nursing
    Home nursing costs a 2.748 2.721 2.203 2.213
    Nursing level a 2.322 1.472 1.638 1.418
Chronic/multi-morbidity
    Different ICD-codes (3-digit) b 1.030 1.027 1.011
    Different ICD-codes (1-digit) b 1.037
    Chronic diseases a 1.625 1.763 1.214
Mental/behavioural system
    ICD F: Mental/behavioural disorders 1.336 b,c 1.124 a,d
    ICD F20: Schizophrenia 2.446 a,e 1.950 a,e
    ICD F19: multiple drug use/other psychoactive substances 3.071 a,e
    ICD F10: use of alcohol 2.071 a,f 2.267 a,e 1.358 a,e
    ATC N: Nervous system 1.659 a 1.39 a 1.351 a 1.020 b
    ATC N05: Psycholeptics 2.157 a
    ATC N06: Psychoanaleptics 2.218 a
Musculoskeletal system
    ICD M: Diseases of the musculoskeletal system 1.030 b,e 0.544 a,g
    ICD M51: Other intervertebral disc disorders 1.356 a,e
    ATC M: Musculo-skeletal system 1.170 a 1.164 a 1.129 a
    Remedial gymnastics 2.489 a 1.019 b 1.068 b
Respiratory system
    ICD J44-J46: other COPD/Asthma 0.772 a,e 1.023 b,d 1.259 a,e 1.707 a,c
        2.114 a,c 1.331 a,f
    ATC J: General antiinfectives for systemic use 1.048 a
Cardiovascular system
    ICD I: Diseases of the circulatory system 0.938 b,f 0.955 b,f
        0.474 a,g
    ICD I25: Chronic ischaemic heart disease 1.213 a,e 1.186 a,e
    ICD I10-I15: Hypertensive diseases 0.854 a,f
    ICD I50: Heart failure 1.134 a,d
    ATC C: Cardiovascular system 1.144 a
    ATC C03: Diuretics 1.289 a 1.249 a
    ATC C09: Agents acting on the renin-angiotensin system 1.472 a 1.140 a
Metabolic system
    ICD E10-E14: Diabetes mellitus 2.142 a,d 1.236 a,f 1.010 b,d
    ATC A: Alimentary tract and metabolism 1.231 a 1.254 a 1.254 a 1.034 b
    Remedy: podiatry 1.551 a
Other
    ATC B: Blood and blood forming organs 1.231 a 1.148 a
    ICD N: Diseases of the genitourinary system 1.112 a,e
    ICD C: Neoplasms 1.772 a,c
Constant 0.069 0.135 0.261 0.517 0.780
Nagelkerke’s R² 0.453 0.460 0.367 0.289 0.211

Dependent variable is high-cost-patient in 2011 yes/no; no is reference category.

All predictors are significant with P < 0.05.

a Yes/no; no is reference category.

b Metric/number.

c Inpatient (main diagnosis).

d Overall.

e Outpatient.

f Inpatient (secondary diagnosis).

g Sickness benefit.

Table 2

Results of stepwise forward regression for women of all age groups

Age group
Predictor (2010)    18–34 Odds ratio    35–49 Odds ratio    50–64 Odds ratio    65–79 Odds ratio    80+ Odds ratio
Insurance status
    Compulsorily a 0.839 0.619
    With dependant coverage a 0.820 0.697 0.669 0.853
    Unemployed a 0.813
    Retired a 1.521
    Medical or employment measurement a 2.118 2.382 2.421
Costs
    Overall (transformed) b 1.225 1.257 1.158 1.149 1.113
    Travelling a 1.265 1.246 1.237 1.249 1.145
    Remedies a 1.188 1.175
    Therapeutic appliances a 1.217 1.267 1.211
    Prescription drugs a 1.339
    Rehabilitation a 0.401
    Sickness benefit a 0.484
Nursing
    Home nursing costs a 2.616 2.382 2.436
    Nursing level a 2.298 1.809 2.467 1.861 1.317
Chronic/multi-morbidity
    Different ICD-codes (3-digit) b 1.052 1.013 1.006 1.014
    Chronic diseases a 1.254 1.274
    Chronic diseases b 0.867
Mental/behavioural system
    ICD F: Mental/behavioural disorders 1.434 a,c 1.569 a,c 1.184 a,c 1.014 a,d
        0.327 a,e 1.462 a,f
    ICD F45: Somatoform disorders 0.898 a,c
    ICD F10: use of alcohol 1.824 a,f
    ICD F30-39: Mood (affective) disorders 1.192 a,d
    ATC N: Nervous system 1.600 a 1.514 a 1.338 a 1.257 a
    ATC N05: Psycholeptics 1.692 a
    ATC N06: Psychoanaleptics 1.551 a
Musculoskeletal system
    ICD M: Diseases of the musculoskeletal system 1.013 b,c
    ICD M17: Gonarthrosis 1.214 a,c
    ATC M: Musculo-skeletal system 1.264 a 1.204 a
    Remedial gymnastics 1.009 b
Respiratory system
    ICD J44-J46: other COPD/Asthma 2.485 a,f 1.220 a,c
    ATC J: General antiinfectives for systemic use 1.037 b
Cardiovascular system
    ICD I: Diseases of the circulatory system 1.004 b,c
        0.934 b,g
    ICD I50: Heart failure 1.172 a,d
    ATC C: Cardiovascular system 1.023 b
    ATC C03: Diuretics 1.262 a
Metabolic system
    ICD E10-E14: Diabetes mellitus 1.795 b,c 1.333 a,d 1.029 b,d 1.015 b,d 1.021 b,d
    ICD E87: Other disorders of fluid, electrolyte and acid-base balance 0.747 a,g
    ATC A: Alimentary tract and metabolism 1.277 a 1.259 a 1.242 a 1.208 a
Other
    ICD C: Neoplasms 1.97 a,f
    ICD R: Symptoms/signs/abnormal clinical/laboratory findings 0.912 b,g
    ICD Z38: Liveborn infants according to place of birth 0.144 a,f
Constant 0.120 0.103 0.210 0.324 0.690
Nagelkerke’s R² 0.331 0.453 0.408 0.338 0.222

Dependent variable is high-cost-patient in 2011 yes/no; no is reference category.

All predictors are significant with P < 0.05.

a Yes/no; no is reference category.

b Metric/number.

c Outpatient.

d Overall.

e Sickness benefit.

f Inpatient (main diagnosis).

g Inpatient (secondary diagnosis).

The first set contained variables describing insurance status, a useful indicator of living conditions and other social factors. For example, ‘dependent coverage’ status generally indicates that a person is either a non-working child or a non-working spouse, whereas status categories like ‘unemployed’ or ‘retired’ are unambiguous. Dependent coverage, a common predictor for women in almost all age groups, had a negative effect: the odds of becoming a high-cost patient decreased with this status. However, insurance status is a useful predictor only in the young and middle age groups because all older insurees have the same status (‘retired’).

A second set contained variables for the prediction of high-cost patients based on expenditures. In this category, we summarized the metric information on overall costs in the previous year as well as several dichotomous variables on costs in different health care sectors. Overall costs and presence of travel costs were predictors for high-cost patients in all age groups of both sexes. Moreover, the costs of therapeutic appliances were important in men of all age groups.

A third set contained variables covering the topic of home nursing, which was a strong predictor among men and women in all age groups. In particular, the presence of home nursing costs in the previous year increased the odds of becoming a high-cost patient in the following year.

Chronic diseases as well as multi-morbidity, as summarized in another set of variables, were significant predictors for high health care costs in the following year. These factors were important in all age groups and for both sexes. The strongest effects were found in younger men with at least one chronic condition, regardless of type.
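
For metric predictors, such as the number of different 3-digit ICD codes, the odds ratio in a logistic model applies per additional unit, so the effect compounds multiplicatively. The following Python sketch illustrates this with the value 1.052 taken from the table row ‘Different ICD-codes (3-digit)’. The function name and the choice of ten additional codes are hypothetical and serve only as an illustration; they are not part of the original analysis.

```python
def cumulative_odds_ratio(per_unit_or: float, units: int) -> float:
    """Odds ratio implied by a change of `units` in a metric predictor.

    Assumes the usual logistic model, in which the log-odds are linear
    in the predictor, so per-unit odds ratios multiply.
    """
    return per_unit_or ** units


# Hypothetical example: ten additional distinct ICD codes,
# per-unit odds ratio 1.052 (value taken from the table).
or_ten_codes = cumulative_odds_ratio(1.052, 10)
print(f"{or_ten_codes:.2f}")  # prints 1.66
```

Under these assumptions, an odds ratio that looks small per unit can still correspond to a substantial difference between patients with few and many distinct diagnoses.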

A more specific set of health-related variables encompassed mental and behavioural disorders. This category, containing information on diagnoses as well as prescription drugs, was more prevalent in the younger age groups. Especially among young men, these disorders were strong predictors of future high-cost patients. Schizophrenia and alcohol/drug abuse were the main factors associated with increasing odds of becoming a high-cost patient.

Another specific group of health-related variables focused on the musculoskeletal system. Apart from diagnoses and drug information, this category contained information on remedies, namely remedial gymnastics. Among young and middle-aged men, this set of variables proved to be an important indicator, with back problems being the predominant issue. Among women, musculoskeletal disorders were predictors for high health care costs in the middle and older age groups, and gonarthrosis was a specific predictor.

Diseases like asthma and COPD were included in the set of respiratory system variables. This category was more relevant in the older age groups, especially among men. The odds of becoming a high-cost patient in the following year more than doubled for men aged 65–79 who were hospitalized for asthma or COPD in the previous year.
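
An odds ratio is not the same as a ratio of probabilities. As a minimal sketch, the Python snippet below converts the odds ratio of 2.485 (inpatient main diagnosis of COPD/asthma, as listed in the table) into a predicted probability. The baseline probability of 10% is an assumption chosen for illustration only; the actual baseline risk within a given sex and age subgroup is not reported in this section, and the function name is hypothetical.

```python
def apply_odds_ratio(baseline_prob: float, odds_ratio: float) -> float:
    """Return the probability obtained after multiplying the baseline odds
    by `odds_ratio`."""
    baseline_odds = baseline_prob / (1.0 - baseline_prob)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


# Assumed baseline: 10% probability of becoming a high-cost patient.
p = apply_odds_ratio(0.10, 2.485)
print(f"{p:.3f}")  # prints 0.216
```

With this assumed baseline, odds that are roughly 2.5 times higher correspond to a probability of about 22% rather than 25%, because odds and probabilities diverge as the risk grows.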

A further set of variables described cardiovascular diseases and medications, which proved to be relevant predictors for high-cost patients in the old age groups. This category was generally more important among men than women, but heart failure was a relevant diagnosis for both sexes.

The last set of variables summarized metabolic diseases and prescription drugs. Podiatry treatment was a relevant predictor for future high health care costs among men aged 65–79. As a category, the metabolic system prevailed among all age groups and both sexes, with diabetes mellitus being the main concern.

Discussion

Data source

SHI data are increasingly used for scientific analysis because they provide a comprehensive overview of health care demands. However, there are some restrictions on the use of these data. The explanatory power of the independent variables used in this analysis is limited by the lack of some potentially relevant information like self-reported health status, clinical parameters and locus of control. Other information, such as income or marital status, was only available for parts of the population, was limited to a maximum amount and/or was not usable for analysis. Therefore, insurance status was the only available variable that could be used as a proxy for social aspects.

Moreover, the use of SHI data is restricted by several laws and rules. The principle of data minimization and defined retention and deletion periods are among the regulations to be considered. In addition, there is a considerable time lag until new data become available. Due to these limitations, a short period of time between predictors and outcome variables was chosen.

Nevertheless, SHI records seem to be an appropriate data source for this analysis because most of the variables needed are available in the data sets. Health insurance files include patient information that is highly relevant to this field of research and which often cannot be captured in studies concerning multi-morbid patients or persons in need of home nursing. By using these data, we successfully identified predictors for high-cost patients that could potentially allow investigators to derive starting points for prevention. These findings are in line with other studies. 21

Challenges for further research

The use of logistic regression with independent variables from a given year to predict who will be a high-cost patient the following year implies a causal relationship. However, a real cause-effect relationship can only be discovered by longitudinal analysis. Thus, this study is intended as a first approximation, in the knowledge that further research is needed to identify the effects of certain health areas. Our results suggest that some insured persons in the study population are likely to remain in permanent high-cost health conditions that cannot be addressed with prevention strategies. Future studies might exclude these long-term high-cost patients in order to gain more specific results.

Furthermore, a dichotomous definition of high-cost patients has the disadvantage that it is not possible to estimate the potential monetary savings from diseases that could be prevented. The aim of this study was to describe potentially useful starting points in terms of their occurrence among high-cost patients prior to the high-cost episode. Future studies might calculate potential savings for specific areas based on these results.

Sex- and age-specific features

The results indicate better model fit among younger age groups. This is advantageous because the impact of prevention affects longer periods of time in younger individuals. However, since older age groups dominate the high-cost group, their needs should not be neglected either.
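
Model fit in this study was assessed by Nagelkerke’s R², which rescales the Cox and Snell R² so that its maximum is 1. As a minimal sketch, the measure can be computed from the log-likelihoods of the intercept-only model and the fitted model. The function name and the numbers in the example are hypothetical and are not taken from the study data.

```python
import math


def nagelkerke_r2(ll_null: float, ll_model: float, n: int) -> float:
    """Nagelkerke's R-squared from two log-likelihoods.

    ll_null:  log-likelihood of the intercept-only model
    ll_model: log-likelihood of the fitted model
    n:        number of observations
    """
    # Cox and Snell R-squared: 1 - (L_null / L_model) ** (2 / n)
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    # Upper bound of Cox and Snell R-squared: 1 - L_null ** (2 / n)
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell / max_cox_snell


# Hypothetical log-likelihoods for a sample of 1000 insured persons.
r2 = nagelkerke_r2(ll_null=-325.0, ll_model=-250.0, n=1000)
print(round(r2, 3))
```

A value of 0 means the predictors add nothing over the intercept-only model, and a value of 1 corresponds to a model that reproduces the outcome perfectly.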

The differences between men and women in mental and behavioural disorders were particularly striking. Furthermore, the fact that insurance status was an important indicator among women implies that the living conditions of women should also be taken into account when implementing prevention strategies.

Potential starting points for prevention

The aim of this study was to identify predictors for high-cost patients that could serve as starting points for prevention before these people develop expensive disease progressions. One category that probably has high potential for prevention is the metabolic system, particularly diabetes, which affects all age groups and both sexes. This result is in line with other studies that found diabetes to be an important predictor for high-cost patients. 7,12 Although diabetes alone is not likely to cause high expenses, it can serve as a catalyst for many other diseases and cause long-term complications. In this case, prevention should start when diabetes is first diagnosed since further progression and complications can be avoided through lifestyle changes and appropriate medication. As diabetes is a significant predictor in all subgroups, prevention strategies should reach many different people to achieve the greatest possible effects. The general prevalence of type 2 diabetes is increasing, 22 but the condition is reversible in the early stages and is easy to treat and control even after it has progressed. However, the consequences of continued progression are potentially fatal, affecting not only patients’ health, but also health economics. Blindness, kidney failure, amputation of the lower limbs, heart disease and stroke are some of the most severe long-term complications of this metabolic disease.

Another set of variables which could be a suitable starting point for prevention encompasses mental and behavioural disorders. These target groups are smaller as mostly young men and women are affected. Due to the more heterogeneous composition of this population, different prevention strategies may be required. Nevertheless, one thing the mental and behavioural disorders have in common is that they are mostly long-term diseases that lead to high costs if not properly treated.

The prevention potential of the predictor sets for home nursing and cardiovascular, respiratory and musculoskeletal disorders is probably low, as these diseases tend to have already progressed to a stage where it is too late to intervene preventively and the damage to the patient’s health is irreversible. Nevertheless, it may still be beneficial to intervene in individuals with such progressed conditions and to optimize health care processes in order to reduce costs in the future.

Finally, the set of variables for insurance status gives no indications relevant to prevention but might be useful for planning the implementation of prevention strategies. Our findings suggest that the importance and contents of most of the investigated categories differ between different age groups and sexes. Therefore, it is unlikely that a ‘one-size-fits-all’ prevention strategy can be developed based on this approach. Instead, different approaches are necessary to address individual needs properly. Information on social factors like marital or working status can help to identify different demands and to specify the right target groups as these variables influence health and affect health care behaviour.

Acknowledgements

This study was implemented in cooperation between Hannover Medical School and AOK Lower Saxony.

Funding

This study was supported by AOK Lower Saxony.

Conflicts of interest: None declared.

Key points

  • If not detected and treated in early stages, chronic diseases are likely to progress and cause high health care costs in the future.

  • It is possible to identify predictors for high-cost patients that focus on specific diseases and, therefore, could be addressed with prevention strategies.

  • Diabetes mellitus is one of the best starting points for prevention as many patients are affected.

References

1

Felder S. Health care expenditures and the aging population. Bundesgesundheitsblatt–Gesundheitsforschung–Gesundheitsschutz 2012;55:614–23.

2

Kuo RN, Lai M-S. The influence of socio-economic status and multimorbidity patterns on healthcare costs: a six-year follow-up under a universal healthcare system. Int J Equity Health 2013;12:69.

3

Porter ME, Guth C. Redefining German Health Care. Berlin, Heidelberg: Springer Berlin Heidelberg, 2012.

4

Amelung VE. Managed Care. Neue Wege im Gesundheitsmanagement. (Managed Care. New Paths in Health Management.), 5th rev. edn. Wiesbaden: Gabler, 2012.

5

GKV-Spitzenverband, Kennzahlen der gesetzlichen Krankenversicherung. (Key Figures for the Health Insurance Division), Germany. Available at: https://www.gkv-spitzenverband.de/media/grafiken/gkv_kennzahlen/kennzahlen_gkv_2014_q4/300dpi_6/GKV-Kennzahlen_VersichertejeSystem_absolut_03-2015_300.jpg (27 August 2015, date last accessed).

6

Gmünder Ersatzkasse, editor. GEK-Gesundheitsreport 2003: Auswertungen der GEK-Gesundheitsberichterstattung. (GEK Health Report 2003. Analyses from GEK Health Reporting). St. Augustin: Asgard, 2003.

7

Etemad LR, McCollam PL. Predictors of high-cost managed care patients with acute coronary syndrome. Curr Med Res Opin 2005;21:1977–84.

8

Leung MKW, Tsui WWS, Chu DWS. Survey on frequent attenders: a study to analyze the associations between frequency of attendance and chronic illness and socio-economic factors in an outpatient clinic. Hong Kong Practitioner 2007;29:S 189–98.

9

Rennemark M, Holst G, Fagerstrom C, Halling A. Factors related to frequent usage of the primary healthcare services in old age: findings from the Swedish national study on aging and care. Health Soc Care Commun 2009;17:304–11.

10

Kersnik J, Švab I, Vegnuti M. Frequent attenders in general practice: quality of life, patient satisfaction, use of medical services and GP characteristics. Scand J Primary Health Care 2001;19:174–7.

11

Jyväsjärvi S, Joukamaa M, Väisänen E, et al. Somatizing frequent attenders in primary health care. J Psychosom Res 2001;50:185–92.

12

Ash AS, Zhao Y, Ellis RP, Schlein Kramer M. Finding future high-cost cases: comparing prior cost versus diagnosis-based methods. Health Serv Res 2001;36:194–206.

13

Fleishman JA, Cohen JW. Using information on clinical conditions to predict high-cost patients. Health Serv Res 2010;45:532–52.

14

Vedsted P, Christensen MB. Frequent attenders in general practice care: a literature review with special reference to methodological considerations. Public Health 2005;119:118–37.

15

Smits FT, Mohrs JJ, Beem EE, et al. Defining frequent attendance in general practice. BMC Fam Pract 2008;9:14.

16

Bergh H, Baigi A, Fridlund B, Marklund B. Life events, social support and sense of coherence among frequent attenders in primary health care. Public Health 2006;120:229–36.

17

ICD-10-GM: Deutschen Institut für Medizinische Dokumentation und Information (DIMDI), Germany. Available at: http://www.dimdi.de/static/de/klassi/icd-10-gm/index.htm (9 September 2012, date last accessed).

18

Deutschen Institut für Medizinische Dokumentation und Information (DIMDI) (editor). Anatomisch-therapeutisch-chemische-Klassifikation mit Tagesdosen. Amtliche Fassung des ATC-Index mit DDD-Angaben für Deutschland im Jahre 2010. Köln: DIMDI, 2010.

19

Backhaus K, Erichson B, Plinke W, Weiber R. Multivariate analysemethoden: eine anwendungsorientierte einführung. (Multivariate analysis methods: an application-oriented introduction.), 11th rev. edn. Berlin, Heidelberg: Springer, 2006.

20

Hartmann J. Entwicklung eines Modells zur Prädiktion von Hochnutzern des Folgejahres unter Zuhilfenahme von Routinedaten der GKV am Beispiel von Daten der AOK Niedersachsen [dissertation]. (Development of a Prediction Model to Identify High-Cost Patients of the Subsequent Year Using Claims Data of the Statutory Health Insurance by the Example of Data from AOK Lower Saxony.). Hannover: Medizinische Hochschule, 2014.

21

OECD, Health at a Glance: Europe, 2012. OECD Publishing 2012. Available at: (10 November 2015, date last accessed).

22

Li R, Zhang P, Barker LE, et al. Cost-effectiveness of interventions to prevent and control diabetes mellitus: a systematic review. Diabetes Care 2010;33:1872–94.
