Background

Depressive disorders are the third leading cause of disability in Europe [1] and lead to considerable costs from treatment, productivity loss, accidents, and suicides [2]. To understand the economic impact of depression and to inform health policy, cost-of-illness studies are important [3]. In patients starting treatment for depression, long-term cost analyses are particularly relevant, since the intervention costs may peak in the first year, while cost reductions secondary to clinical improvement may not occur until subsequent years [4]. Patients responding to depression treatment have lower costs in the first 6–12 months, compared to non-responding patients [5], but there are limited data on this phenomenon beyond 12-month follow-up [6, 7], and no intervention has been shown to reduce costs significantly compared to a control group beyond 12 months [7–11].

Due to the high prevalence of depression, it is also important to assess depression costs in different types of patients and settings. Up to one-half of patients with self-reported [12] or physician-diagnosed [13] depression use complementary therapies, but only a few depression cost analyses have included costs of complementary therapies [14–16].

An opportunity to study the long-term costs of complementary and conventional treatments for depression as well as indirect costs was offered by the Anthroposophic Medicine Outcomes Study (AMOS) [17]. AMOS was a prospective cohort study of outpatients with various chronic indications treated by physicians providing anthroposophic medicine (AM—a complementary therapy system) in addition to conventional care. The study was conducted in conjunction with a health benefit program. An analysis of patients treated for depressive disorder showed long-term improvement of symptoms and health status [18].

Anthroposophic treatment for depression differs from ordinary treatment in its use of non-verbal artistic and physical therapies and special AM medications [18]. Similar to recent guideline recommendations [19], conventional antidepressant drugs are not used as initial therapy for mild depression. In cases of severe depression, however, AM therapies are often combined with antidepressants [20]. Similar to conventional care, AM physicians may offer counselling and refer patients for psychotherapy or, if necessary, for inpatient treatment [18].

One research question of the AMOS study concerned the cost-of-illness of depression and other chronic diseases [21]. Here we present a secondary cost analysis from AMOS, contrasting patients treated for depression, patients with depressive symptoms treated for another disorder, and patients without depressive symptoms.

Methods

Objective and design

The objective was to study long-term depression costs in a German AM outpatient setting from the societal perspective. For this purpose, we analysed healthcare use and productivity loss in a prospective cohort of patients starting AM therapies (AMOS), and calculated direct and indirect costs. The cost analysis had two main objectives:

  1. Comparison of patients treated for depression, patients with depressive symptoms treated for another disorder, and patients without depressive symptoms.

  2. Comparison of the pre-study year to the 1st and 2nd years after enrolment.

Setting, participants, and therapy

Participating physicians were certified by the Physicians’ Association for Anthroposophical Medicine in Germany and had office-based practices or worked in outpatient clinics in Germany. The physicians recruited consecutive patients starting AM therapy. Patients enrolled in the period 1 January 1999–31 March 2001 were included in the present analysis if they fulfilled the eligibility criteria. Inclusion criteria for all patients in this analysis were:

  • age 17–70 years;

  • starting AM therapy for a disorder of minimum 6 months’ duration (main disorder):

    • referral to non-medical therapists (providing AM art therapy involving painting, drawing, clay modelling, music or speech exercises; or AM eurythmy movement exercise therapy; or AM rhythmical massage therapy);

    • or starting AM therapy provided by study physician (AM-related consultations, AM medication) after an initial AM-related consultation ≥30 min.

Further inclusion criteria applied to the three groups studied:

Patients treated for depression (Group 1)

  • Center for Epidemiological Studies Depression Scale, German version (CES-D) ≥24 points, which is the cut-off point for clinically relevant depressive symptoms [22, 23];

  • physician’s diagnosis of main disorder as depression or depressive symptoms;

  • depressed mood plus at least two of the following depressive core symptoms [symptoms of dysthymic disorder, according to Diagnostic and Statistical Manual of Mental Disorders, 4th Edition (DSM-IV)]: poor appetite or overeating, insomnia or hypersomnia, low energy or fatigue, low self-esteem, poor concentration or difficulty making decisions, feelings of hopelessness [24].

Patients with depressive symptoms, treated for another disorder (Group 2)

  • CES-D ≥ 24 points;

  • physician’s diagnosis of main disorder is NOT depression or depressive symptoms.

Patients without depressive symptoms (Group 3)

  • CES-D < 24 points.
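The group definitions above can be read as a simple decision rule. The following minimal sketch illustrates that reading; the class and field names are hypothetical, and the handling of patients who meet the CES-D cut-off and carry a depression diagnosis but lack the core symptoms (returned as `None`, i.e. not assigned to any group) is an assumption rather than something stated in the text.

```python
from dataclasses import dataclass
from typing import Optional

CESD_CUTOFF = 24        # cut-off for clinically relevant depressive symptoms
MIN_CORE_SYMPTOMS = 2   # core symptoms required in addition to depressed mood


@dataclass
class Patient:
    """Hypothetical record holding only the fields needed for group assignment."""
    cesd: int                        # CES-D sum score, 0-60
    main_disorder_depression: bool   # physician's main diagnosis is depression
    depressed_mood: bool
    n_core_symptoms: int             # number of the six listed core symptoms present


def assign_group(p: Patient) -> Optional[int]:
    """Return 1, 2 or 3, or None if no group definition applies."""
    if not 0 <= p.cesd <= 60:
        raise ValueError("CES-D score must lie between 0 and 60")
    if p.cesd < CESD_CUTOFF:
        return 3  # no clinically relevant depressive symptoms
    if not p.main_disorder_depression:
        return 2  # depressive symptoms, treated for another disorder
    if p.depressed_mood and p.n_core_symptoms >= MIN_CORE_SYMPTOMS:
        return 1  # treated for depression, core symptom criteria met
    return None   # assumption: such patients belong to none of the three groups
```

Under this reading, a patient with a CES-D score below 24 is assigned to Group 3 regardless of the physician's diagnosis.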

The patients were treated according to the physician’s discretion.

Outcome measures

The primary outcome was overall health costs, regardless of diagnosis, in the pre-study year and in the 1st and 2nd study years. Costs included direct health costs (AM therapies, non-AM complementary therapies, physician and dentist visits, psychotherapy, medication, physiotherapy, occupational therapy, inpatient hospital and rehabilitation treatment) and indirect costs (sick-leave compensation, early retirement, mortality) and were assessed from a societal perspective.

Depressive symptoms were assessed by the CES-D, ranging from 0 (“no depressive symptoms”) to 60 (“maximum symptoms”). Patients document the frequency of 20 symptoms during the last week, from 0 (“rarely or none of the time ≈ less than 1 day”) to 3 (“most or all of the time ≈ 5–7 days”) [22, 23].

Baseline health status was assessed with the SF-36 Physical Component summary score, the SF-36 Health Change item [25], disease duration, and disease severity. Disease severity was assessed by the physicians on a numerical rating scale [26] from 0 (“not present”) to 10 (“worst possible”).

Medical comorbidity was assessed with the Chronic Disease Score [27]. The Chronic Disease Score is based on medications used to treat chronic diseases, is a good predictor of future hospitalisations and health costs [28], and has been used in a number of cost analyses of depression [6, 29–31]. We used the original version [27] with 17 medication classes (updated for newer medications [32, 33]), which does not include psychotropic medications and is not associated with depression or anxiety [27]. The Chronic Disease Score has a theoretical range of 0–35 points, the upper range requiring intake of 20 different medications. AMOS study documentation was truncated at eight medications per patient and observation period, resulting in a range of 0–17 points.

Resource utilisation included AM therapies, medication, physician and dentist visits, diagnostic investigations, psychotherapy, physiotherapy, occupational therapy, and inpatient hospital and rehabilitation treatment.

Productivity loss included sick-leave, early retirement, and deaths regardless of the cause of death.

Data collection

All data were documented with questionnaires returned in sealed envelopes to the study office. Physicians documented name, duration and severity of main disorder, comorbid disorders, and (Group 1 only) depressive core symptoms; all other items were documented by patients. Patient responses were not made available to physicians. Resource use and productivity loss in the pre-study year were documented at study enrolment; in the 1st and 2nd study years these items were documented after 3 months (medication only), and after 6, 12, 18, and 24 months. Physicians were compensated €40 per included and fully documented patient, while patients received no compensation.

Data were entered twice by two different individuals into Microsoft® Access 97. The two datasets were compared and discrepancies resolved by checking with the original data.

Quality assurance, adherence to regulations

The study was approved by the Ethics Committee of the Faculty of Medicine Charité, Humboldt University Berlin, and was conducted according to the Declaration of Helsinki and International Conference on Harmonisation Good Clinical Practice guidelines. Written informed consent was obtained from all patients before enrolment.

Data analysis

Data analysis [PASW® Statistics 17.0 (http://www.spss.com/statistics/), StatXact® 5.0.3 (http://www.cytel.com/products/statxact/), S-PLUS® 8.0 (http://www.insightful.com/products/splus/)] was performed on all patients fulfilling the eligibility criteria.

Resource use and unit costs (Table 1)

Costs were analysed from a societal perspective. For all resource items except non-AM complementary therapies, costs were calculated by multiplying resource use by the unit cost for the respective item. Out-of-pocket expenditures for non-AM complementary therapies were documented directly.
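The calculation rule described here (resource use multiplied by unit cost, plus directly documented out-of-pocket expenditures) can be sketched as follows. The unit costs in this sketch are placeholders chosen for illustration, not the values of Table 1; only the figure of €309 per general hospital day is quoted elsewhere in the text, and the function and key names are hypothetical.

```python
# Placeholder unit costs in euros (illustrative only, not taken from Table 1).
UNIT_COSTS = {
    "hospital_day": 309.0,           # average for general hospitals quoted in the text
    "physician_visit": 20.0,         # hypothetical value
    "psychotherapy_session": 60.0,   # hypothetical value
}


def total_cost(resource_use, out_of_pocket_complementary=0.0, unit_costs=UNIT_COSTS):
    """Sum of (units used x unit cost) plus directly documented expenditures.

    resource_use maps an item name to the number of units used in the period.
    """
    cost = sum(unit_costs[item] * units for item, units in resource_use.items())
    return cost + out_of_pocket_complementary


# Example: 2 hospital days, 3 physician visits, 50 euros spent out of pocket.
example = total_cost({"hospital_day": 2, "physician_visit": 3},
                     out_of_pocket_complementary=50.0)
```

Because unit costs are held fixed across periods in such a scheme, changes in calculated costs between years reflect changes in resource use only.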

Table 1 Cost calculation

Unit costs (Table 1) were calculated from average costs in Germany for the year 2000 (physicians’ and dentists’ fees, medication, hospital, rehabilitation, sick-leave costs [34, 35]) or from reimbursement fees regulated in healthcare benefit catalogues (AM therapies, diagnostic investigations, psychotherapy, physiotherapy, occupational therapy [36–38]).

Hospital costs were calculated from average costs in each German federal state [39]. Physicians’ fees were calculated from average fees of general practitioners plus 12 specialist categories in the Accounting Data Record Panel of the Central Research Institute of Ambulatory Health Care in Germany [40]. Costs for diagnostic investigations (X-rays, computer tomography scans, nuclear magnetic resonance imaging and scintigrams) were calculated separately [36]. Costs of AM medication (any medication produced by the pharmaceutical companies Abnoba Arzneimittel, Pforzheim, Germany; Helixor Heilmittel, Rosenfeld Germany; Wala Heilmittel, Bad Boll/Eckwälden, Germany; and Weleda, Arlesheim, Switzerland) were calculated from average costs in 51 different price groups. Costs of other medications were calculated from national average costs in 86 Anatomical Therapeutic Chemical Classification Index subgroups [41].

Costs of productivity loss included costs for sick-leave, early retirement, and deaths. Costs for sick-leave and early retirement were calculated from national average gender-specific earnings for civil servants, salaried employees, and wage earners (100% compensation for days 1–42, 70% compensation thereafter) [35]. Mortality costs (4.3% of all costs of productivity loss) were calculated with the same method.
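The two-tier compensation rule for sick-leave (100% for days 1–42, 70% thereafter) can be illustrated with a short sketch. It assumes that the sick-leave days of a period are counted consecutively against the 42-day threshold and uses a hypothetical daily earnings figure; the text does not specify how separate sick-leave episodes were handled.

```python
FULL_COMPENSATION_DAYS = 42   # days compensated at 100%
REDUCED_RATE = 0.70           # compensation rate from day 43 onwards


def sick_leave_cost(days, daily_earnings):
    """Cost of sick-leave for a number of days at the given average daily earnings."""
    if days < 0:
        raise ValueError("days must be non-negative")
    full_days = min(days, FULL_COMPENSATION_DAYS)
    reduced_days = max(days - FULL_COMPENSATION_DAYS, 0)
    return daily_earnings * (full_days + REDUCED_RATE * reduced_days)
```

With hypothetical daily earnings of €100, 50 days of sick-leave would be costed as 42 days at €100 plus 8 days at €70.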

Unless otherwise stated, costs were not discounted.

Analysis of missing data

Missing data for resource items and for expenditures for non-AM complementary therapies consisted of: (1) missing data for items in available questionnaires (0.5% of items), and (2) missing follow-up questionnaires (14.7% of administered questionnaires). Missing data for (1) were replaced by the group mean value for the respective item and follow-up period. Missing data for (2) were replaced with the last value carried forward—the last value referring to the preceding 6-month documentation period (missing values for the 0–6-month follow-up period were replaced with 50% of the corresponding value in the pre-study year). In an alternative analysis, missing data for (2) were replaced using multiple imputations [42]; results differed little from the main analysis, with maximum differences in unadjusted total costs of 5, 5, and 7% in Groups 1, 2, and 3, respectively.
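The replacement rule for missing follow-up questionnaires can be sketched as below. The sketch simplifies the design to one value per 6-month period for a single resource item (it ignores the separate 3-month medication questionnaire), and the function name and data layout are hypothetical.

```python
def fill_missing_periods(pre_study_year, periods):
    """Fill missing 6-month values by last value carried forward.

    pre_study_year: documented value for the 12 months before enrolment.
    periods: values for months 0-6, 6-12, 12-18 and 18-24; None marks a
             missing follow-up questionnaire.
    """
    filled = []
    # A missing 0-6-month value is replaced by 50% of the pre-study-year value.
    previous = 0.5 * pre_study_year
    for value in periods:
        if value is None:
            value = previous  # carry forward the preceding 6-month value
        filled.append(value)
        previous = value
    return filled
```

For example, with a pre-study value of 1000 and follow-up values `[None, 300, None, None]`, the filled series is `[500, 300, 300, 300]`.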

Missing data also occurred for two variables used in the adjusted costs analyses: missing values for baseline disease severity (physician documentation: n = 6 values missing in 487 patients) were replaced by the respective values for a corresponding outcome documented by patients (Symptom Score [17]). Missing values for baseline SF-36 scores (n = 6 values missing out of 3,896 values) were replaced by the mean value in the respective patient group (1, 2 or 3).

Unadjusted analyses

For total costs, bootstrap means with bias-corrected and accelerated (BCa) bootstrap 95% confidence intervals (95% CI) were calculated, using 2,000 replications per analysis [43]. For all other unadjusted analyses, two independent samples were compared with the Mann–Whitney U test and Fisher’s exact test for continuous and dichotomous data, respectively; three independent samples were compared with the Kruskal–Wallis test and the Cochran–Armitage trend test. Median differences with 95% CI were estimated according to Hodges and Lehmann [44]. All unadjusted analyses were considered explorative, with significance criteria P < 0.05 and 95% CI not including 0.
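For readers unfamiliar with the BCa interval, the following sketch shows one standard way to compute it for a mean using only the Python standard library. It is an illustration of the general technique, not the routine of the statistical packages used in the study; the function name, the seed, and the quantile indexing are assumptions.

```python
import random
from statistics import NormalDist, mean


def bca_bootstrap_ci(data, stat=mean, n_resamples=2000, alpha=0.05, seed=0):
    """Return (bootstrap mean, lower limit, upper limit) of a BCa interval.

    data must be a list of numbers.
    """
    rng = random.Random(seed)
    n = len(data)
    theta_hat = stat(data)
    boot = sorted(stat(rng.choices(data, k=n)) for _ in range(n_resamples))
    nd = NormalDist()

    # Bias correction: share of bootstrap statistics below the observed value,
    # clamped away from 0 and 1 so that the inverse normal CDF is defined.
    share = sum(b < theta_hat for b in boot) / n_resamples
    share = min(max(share, 1 / (n_resamples + 1)), n_resamples / (n_resamples + 1))
    z0 = nd.inv_cdf(share)

    # Acceleration, estimated from leave-one-out (jackknife) statistics.
    jack = [stat(data[:i] + data[i + 1:]) for i in range(n)]
    jack_mean = mean(jack)
    num = sum((jack_mean - j) ** 3 for j in jack)
    den = 6.0 * sum((jack_mean - j) ** 2 for j in jack) ** 1.5
    a = num / den if den else 0.0

    def limit(q):
        z = nd.inv_cdf(q)
        adjusted = nd.cdf(z0 + (z0 + z) / (1 - a * (z0 + z)))
        index = min(max(int(round(adjusted * (n_resamples - 1))), 0), n_resamples - 1)
        return boot[index]

    return mean(boot), limit(alpha / 2), limit(1 - alpha / 2)
```

The bias correction and acceleration terms shift the percentile limits, which matters for right-skewed data such as health costs.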

Adjusted analyses

Total costs in the three groups and 3 years were adjusted for socio-demographic characteristics, comorbidity, and baseline health status. Since the three groups were defined by their depression status, we did not adjust for depressive symptoms (baseline CES-D) and included only those independent variables that showed low or minimal correlation to baseline CES-D, defined as Spearman-Rho < 0.3. The following 11 variables were included in the main analysis: patient group (1, 2 or 3), age, gender, education, engagement in economic activity, living in the former German Democratic Republic, disease duration, Chronic Disease Score, baseline disease severity, baseline SF-36 Physical Component, and baseline SF-36 Health Change.
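The covariate screening step can be illustrated with a small standard-library sketch that computes Spearman's rho (with average ranks for ties) and keeps only variables below the threshold. Applying the threshold to the absolute value of rho is an assumption, and the function and variable names are hypothetical.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


def select_covariates(candidates, baseline_cesd, threshold=0.3):
    """Keep candidate variables whose |rho| with baseline CES-D is below threshold."""
    return [name for name, values in candidates.items()
            if abs(spearman_rho(values, baseline_cesd)) < threshold]
```

Lowering `threshold` to 0.2 corresponds to the stricter criterion of the alternative analysis.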

In an alternative analysis, stricter criteria for inclusion of independent variables were used, with a correlation to baseline CES-D <0.2 instead of <0.3. This led to exclusion of 1 of the 11 variables (SF-36 Health Change) from the models, although the overall results were very similar.

In another alternative analysis, the impact of a 12th variable, household income, was explored. This variable was available for only 84% of patients, was not a significant predictor in any models, and was not included in the main analysis.

The following adjustment models were used:

  • In a preliminary analysis, general linear models with repeated measures were used, with the natural logarithm of total costs as dependent variable and study year as within-subject variable. Results were similar to results of the other adjusted analyses but are of lesser utility due to the use of the logarithm of costs [45]. Results are therefore not shown.

  • For the main analysis, generalised linear modelling was used to build three models with total costs in the pre-study year, 1st, and 2nd years, respectively, as dependent variable [46–48]. Since costs were highly skewed, a gamma distribution with a log link was used. The influence of the 11 independent variables on the costs was analysed in a main effects model, using a robust covariance matrix, estimation of scale parameters by the maximum likelihood method, testing for model effects and between-group differences with Wald chi-square statistics, and estimation of marginal means with Wald type 95% CI. (An alternative model included the interaction of age and gender, which was a significant predictor in the pre-study year only. Results were otherwise changed only minimally, and this interaction was not used in the main analysis.) Multiple comparisons between Groups 1, 2 and 3 were corrected with the sequential Bonferroni method. For all 11 variables in the models the variance inflation factors were <1.2, suggesting that multicollinearity was not a problem. The model for the 1st year showed a good fit with a ratio deviance/df of 1.024, while the models for the pre-study year and 2nd year showed signs of some overdispersion with ratios deviance/df of 1.387 and 1.685, respectively. Scatter plots of standardised deviance residuals, Pearson residuals, and likelihood residuals vs predicted mean revealed 2–4 outliers for each model. According to Cook’s distance and leverage values, 0–2 of these outliers were potentially influential for each model, respectively. Data inspection showed no evidence of documentation errors of the cost values for the identified outliers. Moreover, the exclusion of outliers from the models led to only minimal changes (with the exception of one change in the 1st year; see Results section). Hence all outliers were retained in the main analysis.

  • A supplementary fourth model was built using generalised estimating equations with average total costs across the 3 years as dependent variable, study year as within-subject variable, and model specifications analogous to the three generalised linear models described above.
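In generic notation, the gamma model with log link used for the main analysis can be written as follows. The symbols are introduced here for illustration only and do not appear in the study documentation: C_i denotes the total costs of patient i in a given year, x_ij the covariates (categorical variables entering as indicator terms), and φ the scale parameter.

```latex
\begin{aligned}
C_i &\sim \mathrm{Gamma}(\mu_i, \phi), \qquad \operatorname{Var}(C_i) = \phi\,\mu_i^{2},\\
\log \mu_i &= \beta_0 + \sum_{j=1}^{p} \beta_j x_{ij},\\
\frac{\mu_i(x_{ij}+1)}{\mu_i(x_{ij})} &= e^{\beta_j}.
\end{aligned}
```

The last line shows why this specification is convenient for cost data: each coefficient acts multiplicatively on expected costs on the original euro scale, so no back-transformation of log costs is needed.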

Sensitivity analyses

Two sensitivity analyses were performed, for the following reasons:

  1. A previous sensitivity analysis of cost units in the AMOS study [17] had identified one item for which uncertainty in the assumptions could have a relevant impact on costs: inpatient hospital treatment. Hospital costs will vary according to hospital specialisation, which was not documented in the study; therefore average costs across hospital specialties were used. Costs in German psychiatric and neurological hospitals are lower (average €203 per day) than in general hospitals (€309). Accordingly, hospital costs in depressed patients (Groups 1 and 2) were analysed under the extreme assumption that 100% of hospital days were spent in psychiatric or neurological hospitals (instead of 7%, the German average) [35, 49].

  2. The inclusion criteria for Group 1 required fulfilment of all depression criteria, which can be considered a narrow definition, as AM physicians may treat patients for depressive symptoms regardless of diagnostic criteria. Accordingly, total costs were analysed using less restrictive criteria for Group 1, including all patients aged 17–70 years starting therapy for depressive symptoms.

Results

Participating physicians

A total of 104 AM-certified physicians enrolled patients into the study; these physicians did not differ significantly from AM-certified physicians in Germany not enrolling patients (n = 258) regarding age (mean ± SD: 45.9 ± 7.0 vs 48.3 ± 8.1 years), gender (54.8 vs 65.1% males), number of years in practice (18.0 ± 7.4 vs 20.2 ± 9.2 years) or the proportion of physicians working in primary care (86.5 vs 84.4%).

Patient recruitment and follow-up

From 1 January 1999 to 31 March 2001, a total of 783 patients aged 17–70 years starting AM therapy were screened for inclusion. Of these, 487 patients were included in the present analysis (Fig. 1). Of the 296 patients who were not included, 206 were potentially eligible. These potentially eligible patients did not differ from included patients in terms of gender, disease duration or baseline disease severity. The potentially eligible patients not included were an average of 6.3 years (95% CI 3.5–9.2 years) younger than included patients (P < 0.001).

Fig. 1 Patient recruitment. CES-D Center for Epidemiological Studies Depression Scale, German version

A total of 81.9% (n = 399/487) of patients were enrolled by general practitioners, 6.8% by internists, 5.7% by gynaecologists, 3.1% by psychiatrists, and 2.5% by other specialists. The physicians’ settings were primary care practices (87.5% of patients, n = 426/487), referral practices (8.4%), and outpatient clinics (4.1%).

The last patient follow-up ensued on 30 April 2003. A total of 97.5% (n = 475/487) of patients returned at least one follow-up questionnaire; return rates were 93.2, 91.4, 86.2, 80.1, and 75.4% after 3, 6, 12, 18, and 24 months, respectively. The 487 patients were administered a total of 2,435 follow-up questionnaires, of which 2,076 (85.3%) questionnaires were returned. Return rates did not differ significantly between Groups 1, 2, and 3 at any follow-up point. Respondents and non-respondents of the 24-month follow-up questionnaire did not differ significantly regarding age, gender, disease duration, baseline CES-D or total health costs in the pre-study year.

Baseline characteristics

Patients were recruited from 15 of 16 German federal states. The three groups did not differ significantly regarding demographics, disease duration or comorbidity. The groups differed regarding disease severity, SF-36 Physical Component, SF-36 Health Change, and, by nature of the inclusion criteria, by the main diagnosis and baseline CES-D (Table 2).

Table 2 Baseline characteristics

In Group 1, the duration of the depressive disorder was 6–11 months in 9% (n = 7/81) of patients, 12–23 months in 9%, and ≥2 years in 83%. Of the patients in Group 1, 23% (n = 19/81) had a history of inpatient psychiatric treatment.

Change in depressive symptoms

From baseline to 24-month follow-up, CES-D improved by median 15.9 points (95% CI 12.5–19.0, P < 0.001) in Group 1, by 12.8 points (95% CI 9.5–16.0, P < 0.001) in Group 2, and by 3.0 points (95% CI 2.0–4.0, P < 0.001) in Group 3.

Resource use and productivity loss

Psychotherapy use and the number of visits to physicians and dentists differed significantly between the three groups at all time periods, with lower use in Group 3. Three items differed significantly between the three groups in one time period each (see Table 3), while the remaining items did not differ between the groups in any period.

Table 3 Health resource use and productivity loss. Values are mean ± standard deviation

In all three groups, AM therapies and AM medication use increased significantly in the 1st year compared to the pre-study year (P < 0.001 for all comparisons); AM therapies also increased in the 2nd year in all three groups. These increases are a consequence of the study inclusion criterion of patients starting AM treatment. Compared to the pre-study year, the number of inpatient hospital days decreased significantly in the 1st year in Group 3 (P = 0.007) and in the 2nd year in Group 1 (P = 0.024), the number of inpatient rehabilitation days was decreased in the 1st (P = 0.016) and 2nd years (P = 0.003) in Group 1, the number of diagnostic investigations was reduced in the 1st year in Group 2 (P = 0.002), and the number of patients with early retirement was increased in the 2nd year in Group 3 (P = 0.008). No other significant pre-post changes occurred in any group or time period.

Costs

Unadjusted analyses

Unadjusted total costs and costs for all analysed items are presented in Table 4. In the pre-study year, costs in Group 1 (bootstrap mean €6,739) were 41% higher than in Group 2 (€4,763) and 88% higher than in Group 3 (€3,588). In each group (1, 2, and 3), total costs in the pre-study year were compared to total costs in the 1st and 2nd years, respectively. One significant change was observed: in the 2nd year, costs were reduced significantly from the pre-study year in Group 1 (bootstrap mean difference €1,808; 95% CI €1,110–€4,858). No other significant pre-post changes were observed in any group or period. The cost reduction in the second year in Group 1 could be attributed largely to a reduction in inpatient hospitalisation (from €3,006 to €1,015) and long-term sick-leave (from €1,038 to €442).

Table 4 Unadjusted health costs (€). Values are mean ± standard deviation

Among the cost items analysed, hospital costs and sick-leave costs were the largest contributors to total costs in all groups and periods, together amounting to an average of 51% of total costs across the 3 years. Hospital costs amounted to an average of 26% of total costs (range: 14% of costs in Group 2 in the 1st study year to 45% of costs in Group 1 in the pre-study year) and sick-leave costs amounted to an average of 25% (range: 22% of costs in Groups 1 and 3 in the 1st year to 33% of costs in Group 2 in the pre-study year). Costs for conventional outpatient treatment (physician and dentist visits, psychotherapy, medication, physiotherapy, occupational therapy) amounted to 25% of all costs, while costs for complementary treatment amounted to 11% (AM therapies: 7%, non-AM complementary therapies: 4%).

Cost distribution was highly skewed in all groups and periods; in Group 1, the four most “expensive” patients in each period (5% of 81 patients) caused 34, 41, and 32% of costs in the pre-study year, 1st, and 2nd years, respectively.

In depressed patients (Groups 1 + 2), the correlation between baseline CES-D and total costs in the pre-study year and 1st year, respectively, was calculated. Correlations were low (Spearman-Rho 0.11 or lower) and not significant (P = 0.151 or higher).

Adjusted analyses

The 11 independent variables were analysed in four separate models for the pre-study year, 1st year, 2nd year, and across all 3 years, respectively (Table 5). Two variables (disease severity, SF-36 Physical Component) were significant predictors for costs in all four models—higher costs being predicted by higher disease severity and by lower scores on the SF-36 Physical Component. One variable (patient group) was significant in all models except the 2nd year. Four variables (SF-36 Health Change, Chronic Disease Score, gender, engagement in economic activity) were significant in the pre-study year only. The remaining four variables (living in the former German Democratic Republic, disease duration, age, education) were not significant predictors in any model. Removing these four variables from all models—or removing all non-significant variables in each model—affected results only minimally, thus all 11 variables were retained in all four models. Estimated marginal means for total costs are presented in Table 6 and Fig. 2:

Table 5 Adjusted analyses: model effects. GZLM Generalised linear models, GEE generalised estimating equations

  • In the pre-study year, costs in Group 1 (estimated marginal mean €7,129) were 63% higher than in Group 2 (€4,371) and 102% higher than in Group 3 (€3,532). Cost differences in the pre-study year were significant between Groups 1 and 2 (mean difference €2,758, 95% CI €67 to €5,449, P = 0.043) and Groups 1 and 3 (mean difference €3,597, 95% CI €775 to €6,490, P = 0.007), but not between Groups 2 and 3 (P = 0.180).

  • In the 1st study year, costs in Group 1 (€6,029) were 71% higher than in Group 2 (€3,522) and 80% higher than in Group 3 (€3,353). No between-group differences were significant in the 1st year, although group assignment was a significant predictor (P = 0.019). However, when two potentially influential outliers were removed from the model of the 1st year, cost differences between Groups 1 and 3 became significant (P = 0.032).

  • In the 2nd study year, costs in Group 1 (€4,929) were 30% higher than in Group 2 (€3,792) and 22% higher than in Group 3 (€4,031). No between-group differences were significant in the 2nd year.

  • Costs across all 3 years differed significantly between Groups 1 and 2 (P = 0.027) and 1 and 3 (P = 0.021), but not between Groups 2 and 3 (P = 0.669; Table 6).
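The relative differences quoted in the list above follow directly from the estimated marginal means. The short sketch below recomputes them from the figures given in the text; the function name and data layout are illustrative.

```python
def percent_higher(a, b):
    """Percentage by which a exceeds b, rounded to the nearest whole percent."""
    return round(100 * (a / b - 1))


# Estimated marginal means in euros (Group 1, Group 2, Group 3), as quoted in the text.
ADJUSTED_MEANS = {
    "pre-study year": (7129, 4371, 3532),
    "1st year": (6029, 3522, 3353),
    "2nd year": (4929, 3792, 4031),
}

for period, (g1, g2, g3) in ADJUSTED_MEANS.items():
    print(f"{period}: Group 1 vs 2: +{percent_higher(g1, g2)}%, "
          f"Group 1 vs 3: +{percent_higher(g1, g3)}%")
```

The recomputed values (63% and 102%, 71% and 80%, 30% and 22%) agree with the percentages reported for the three periods.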

Table 6 Adjusted analyses: estimated marginal means for total costs (€). P-values are based on Wald chi-square statistics and indicate if between-group differences are significant

Fig. 2 Adjusted total health costs per patient and year. Estimated marginal means and 95% confidence intervals (CI) after adjustment for socio-demographic characteristics, comorbidity and baseline health status (see text for details). Group 1: patients treated for depression; Group 2: patients treated for another disorder, with depressive symptoms; Group 3: patients treated for another disorder, without depressive symptoms

Since costs in Groups 2 and 3 did not differ significantly in any year, the analyses were repeated comparing Group 1 to Groups 2 + 3 (Table 6). The resulting models were very similar to the main models, with significant between-group differences in the pre-study year (P = 0.003), 1st year (P = 0.027), and across all 3 years (P = 0.007), but not in the 2nd year (P = 0.242).

Compared to unadjusted total costs, adjusted costs were lower in all years in Groups 2 and 3 (range from 2% lower in Group 3 in the pre-study year to 25% lower in Group 2 in the 1st year), while adjusted costs in Group 1 were 6% higher, 12% lower, and 0% higher than unadjusted costs in the pre-study year, 1st, and 2nd years, respectively.

Sensitivity analyses

In the first sensitivity analysis, unadjusted hospital costs in Groups 1 and 2 were calculated on the basis of average costs of neurological and psychiatric hospitals in Germany, instead of average costs for all hospitals. Compared to the main analysis, total costs in Group 1 were reduced by 13% (€901), 12% (€798), and 7% (€321) in the pre-study year, 1st, and 2nd study years, respectively; while costs in Group 2 were reduced by 11% (€533), 5% (€228), and 6% (€274), respectively (P = 0.002 or lower for all comparisons).

In the second analysis, the eligibility criteria for Group 1 (main analysis: n = 81) were widened to include all AMOS patients aged 17–70 years starting therapy for depressive symptoms (n = 133, of which n = 36 patients were from Group 3). Compared to the main analysis, unadjusted total costs in Group 1 were reduced by 8% (€554, P = 0.268) and 13% (€903, P = 0.617) in the pre-study year and 1st year, respectively, and increased by 4% (€216) in the 2nd year (P = 0.860).

Comparison to another primary care depression sample in Germany

Unadjusted healthcare costs in Group 1 (90% primary care) were compared to corresponding costs in a German primary care patient cohort starting treatment-as-usual for a depressive disorder [50]. For the purposes of comparability, the comparison was restricted to cost items that were analysed in both studies (outpatient treatment, medication, inpatient hospital treatment, and rehabilitation; in Group 1 also AM therapies and medications). The comparison shows a similar order of magnitude and distribution of costs in both patient groups (Fig. 3).

Fig. 3 Healthcare costs in the 1st year in Group 1 and in a subgroup of the Salize Study [50]. Group 1: Patients treated for depression (n = 81). Costs of 1st study year (2000) were discounted by +5%. Salize subgroup: Primary care patients (n = 89), costs in year 2001. Outpatient treatment: physician visits, diagnostic examinations, psychotherapy, physiotherapy, occupational therapy; in Group 1 also anthroposophic therapies and medications

Discussion

Major findings

We have analysed costs of healthcare and productivity loss in three consecutive years in 487 German adult outpatients starting complementary treatment (AM) for chronic disease. In the year preceding study enrolment, costs in patients treated for depressive disorder (Group 1) were twice as high as costs in non-depressed patients (Group 3), and 63% higher than in patients with depressive symptoms treated for another disorder (Group 2). Compared to the pre-study year, costs were significantly decreased in the second study year in Group 1, while costs in Groups 2 and 3 showed little change. An average of half of total costs were caused by inpatient hospitalisation and sick-leave, while conventional and complementary outpatient treatment amounted to 25 and 11% of total costs, respectively.

Strengths and limitations

To our knowledge, this is the first cost analysis comparing primary care patients treated for depression with depressed patients treated for another disorder and with non-depressed patients. It is also one of very few depression analyses providing annual cost data for 2 or 3 consecutive years [6, 10, 51]. Further strengths of this study include high follow-up rates and the wide range of cost domains assessed. The healthcare cost domains analysed in this study amount to 87% of healthcare expenditures of the German statutory health system: physician and dentist services, psychotherapy, physiotherapy, occupational therapy, medication, inpatient treatment, sick-leave compensation [34] as well as out-of-pocket expenditures for complementary therapies. Healthcare costs for dentures, medical appliances, nursing, patient transport, and health prevention programs (13% of expenditures in Germany) were not documented in the study and could not be analysed. Our analysis of indirect costs included sick-leave costs, costs from early retirement, and deaths, while reduced productivity in the workplace was not documented and could not be analysed. Also not analysed were intangible costs from suffering and from reduced quality of life in patients or relatives. Intangible costs of depression are difficult to quantify, and were not assessed in any studies included in a recent systematic review of cost-of-illness studies of depression [3]. Since depressive symptoms (and quality of life, data not shown) improved during the 2-year follow-up in all groups, intangible costs would be expected to decrease, not to increase. Therefore, the inclusion of intangible costs would not have changed the direction of cost reduction observed in Group 1.

We did not attempt to separate depression-related treatment costs from other treatment costs. Due to the high level of comorbidity in depression, our approach of analysing total healthcare costs may be more appropriate and has been used more extensively than analyses of depression-related costs only [3].

In the main analyses, costs were not discounted. Since all costs except costs of non-AM complementary therapies were calculated by multiplying resource use with unit costs, the cost differences from the pre-study year in the 1st and 2nd years in each group reflect differences in resource use and productivity loss only, and not inflation. Costs of non-AM complementary therapies were documented directly and were also not discounted. However, since these costs amounted to only 4% of total costs, discounting would have had only minimal effects on the total costs.
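The costing logic described above can be illustrated with a minimal sketch. The unit costs and resource-use figures below are hypothetical placeholders, not values from this study; the point is only that, with unit costs held constant across years, any difference in annual cost arises from a difference in resource use.

```python
# Hypothetical unit costs in EUR, held constant across all study years.
# These are illustrative assumptions, not the unit costs used in the study.
UNIT_COSTS = {
    "hospital_day": 300.0,
    "sick_leave_day": 100.0,
    "physician_visit": 20.0,
}


def annual_cost(resource_use):
    """Sum of (quantity used x constant unit cost) over all resource items."""
    return sum(UNIT_COSTS[item] * quantity for item, quantity in resource_use.items())


# Hypothetical resource use of one patient in two periods.
pre_study_year = {"hospital_day": 10, "sick_leave_day": 30, "physician_visit": 20}
second_year = {"hospital_day": 4, "sick_leave_day": 18, "physician_visit": 18}

cost_pre = annual_cost(pre_study_year)
cost_second = annual_cost(second_year)

# Because unit costs do not vary by year, this difference reflects
# changed resource use only, not price changes.
difference = cost_second - cost_pre
```

With identical resource use in both periods the difference would be exactly zero, whatever the unit costs, which is why undiscounted comparisons are unaffected by inflation under this approach.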

Group 1 in this analysis consisted of adult patients treated for depression of at least 6 months’ duration. Patients were recruited by physicians offering routine care, and structured psychiatric interviews to assess all criteria for depression according to DSM-IV or ICD-10 (International Statistical Classification of Diseases and Related Health Problems, 10th Revision) were not feasible, which limits diagnostic comparability with other studies. However, all patients in Group 1 fulfilled the DSM-IV core symptom criteria for dysthymic disorder and 83% of patients fulfilled the additional criterion of at least 2 years symptom duration.

Since the study had a long recruitment period, the study physicians were not able to participate throughout the period and to screen and enrol all eligible patients (criteria: see Methods). For a different subset of patients from the AMOS project (patients referred to AM therapies for any chronic indication), it was estimated that physicians enrolled every fourth eligible patient [18]. This selection could affect results if physicians were able to predict therapy response and if they preferentially screened and enrolled patients for whom they expected a particularly favourable outcome. In this case one would expect the degree of selection (the proportion of eligible vs enrolled patients) to correlate positively with clinical outcomes. That was not the case: the correlation was almost zero (−0.04). This analysis [18] does not suggest that physicians’ screening of eligible patients was affected by selection bias. Nevertheless, selection bias affecting costs cannot be ruled out.

Another issue is the validity of the methods used to discriminate between the three groups: Groups 1 and 2 differed in the physician’s decision to start treatment for depression vs another disorder. From the physician’s perspective this distinction is highly relevant, and it has been used in a number of depression cost analyses [15, 29, 52–57]. In addition, all Group 1 patients fulfilled core criteria for DSM-IV dysthymic disorder. In our analysis, the distinction between Groups 1 and 2 was associated with large cost differences in the pre-study year and 1st year (63 and 71% higher in Group 1 in the pre-study year and 1st year, respectively). Groups 2 and 3 differed by the cut-off point for “relevant depressive symptoms” on the CES-D, German version [23]. The cut-off point for the original English CES-D version has been used to define depression in several cost analyses [30, 58]. The cut-off value for the German version (24 points) is one standard deviation above the average score in the German population and is a highly sensitive marker for depressive symptoms, insofar as 94% of patients with a DSM-III-R diagnosis of an acute depressive episode but only 17% of the general population will have elevated CES-D scores. On the other hand, the specificity of high CES-D scores for a depression diagnosis is low [23]. In our analysis, the distinction between Groups 2 and 3 was associated with modest and non-significant cost differences (24 and 5% higher in Group 2 in the pre-study year and 1st year, respectively). Groups 1 and 3 differed regarding the physician’s decision to start treatment for depressive symptoms (100 vs 12% of patients) and our criteria for depression (100 vs 0%). These criteria included the CES-D cut-off value (see above) and the core criteria for DSM-IV dysthymic disorder [24]. In our analysis, this distinction was associated with the largest cost differences (102 and 80% higher in Group 1 in the pre-study year and 1st year, respectively).

A limitation of the study documentation of inpatient hospitalisation is that hospital specialisation was not recorded. German psychiatric and neurological hospitals have on average 33% lower costs than general hospitals. Accordingly, we performed an extreme-scenario sensitivity analysis, assuming all hospital days in depressed patients (Groups 1 and 2) were spent in psychiatric or neurological hospitals. This resulted in only modest reductions of total costs (range 5–13% for all time periods).
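The order of magnitude of this result can be checked with a rough sketch. It assumes, as a simplification, that revaluing every hospital day at a 33% lower rate lowers total costs by 33% of the inpatient share of total costs. The function name is hypothetical, and the 35% inpatient share is the figure reported for Group 1 in the 1st year; shares in other groups and periods differ.

```python
# Cost difference between psychiatric/neurological and general hospitals,
# as stated in the text.
PSYCH_HOSPITAL_DISCOUNT = 0.33


def total_cost_reduction(inpatient_share, discount=PSYCH_HOSPITAL_DISCOUNT):
    """Relative reduction in total costs when all hospital days are revalued
    at the lower rate. Simplifying assumption: all other cost domains are
    unchanged, so only the inpatient share of total costs is affected."""
    return inpatient_share * discount


# Inpatient share of total costs in Group 1, 1st year (35%).
reduction_group1_year1 = total_cost_reduction(0.35)
```

Under these assumptions the reduction for Group 1 in the 1st year is roughly 12% of total costs, which lies within the reported range of 5–13%.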

Like most depression cost analyses conducted outside the United States [3], our study was based on patient self-reporting of resource use. All resource items were documented over 3 years and continuous diary-keeping was not feasible. Therefore, patient documentation of use during the preceding 12 months (at study enrolment) and preceding 6 months (at 6-, 12-, 18-, and 24-month follow-up) might be affected by recall bias. Since patient recall of resource use declines over time with a net tendency towards under-reporting [59], under-reporting is more likely for the 12-month pre-study period than for the shorter periods after study enrolment. Therefore, cost reductions from the pre-study year to subsequent years might be larger than reported (and cost increases may be lower than reported).

Cost comparisons between the three groups were adjusted for ten potentially relevant variables pertaining to socio-demographics, comorbidity, and baseline health status. Four of these variables were not significant predictors in any models, and of the remaining six variables, only three measures of baseline health status (disease severity, SF-36 Physical Component, SF-36 Health Change) differed significantly between the groups. Like all statistical adjustment models, our models are imperfect representations of reality [46]; they do not prove causality, and residual confounding cannot be ruled out. Furthermore, the confidence intervals for unadjusted (Table 4) as well as adjusted (Table 6) cost estimates were relatively wide, particularly in Group 1. This may be due to the skewed cost distribution in all three groups, and a relatively modest sample size for the analysis of skewed data in Group 1 [60]. Among the three main models in the present analysis, the best model fit was observed for the 1st study year, while the models for the pre-study year and the 2nd year showed signs of some overdispersion. This finding could have several possible explanations: as noted above, the longer documentation period for resource use in the pre-study year could lead to more recall bias and more errors in resource documentation for this period. Also, since most of the significant predictors in the cost models referred to the baseline status, predictors could have a more direct relationship to costs in the ensuing 1st year than to costs in the pre-study year and in the 2nd year.

Agreement with other studies, and interpretation

The patients treated for depression in this study (Group 1) differed from other depression cohorts in two aspects: at enrolment, 100% of patients in Group 1 started artistic or complementary therapies (AM), combined with antidepressants and/or psychotherapy in 45% of patients [18]. In other treated depression cohorts, 100% of patients usually start treatment with antidepressants and/or psychotherapy, while up to 50% of subjects with depression reportedly use complementary therapies [12, 13]. In Group 1, complementary treatment (AM and non-AM) amounted to 12% of all costs in the 1st year, while conventional outpatient treatment amounted to 22%, and inpatient hospital treatment amounted to 35% of costs. A few other depression cost studies have included costs for complementary treatment [14–16], but none allow for direct comparison to our study, either because these costs were not presented separately [14, 16] or because the only costs for complementary treatment included were costs for spinal manipulation [15], which is classified as conventional treatment in Germany. The costs of complementary treatment for depression in this study (12% of total costs and 18% of treatment costs in the 1st year) suggest that depression cost analyses in settings with significant use of complementary therapies should account for the costs of complementary treatment. Another difference to other studies was observed regarding the female-to-male ratio, which was much larger in Group 1 with 90% primary care patients (7.1/1.0) than in other German primary care depression cohorts (1.3–3.0/1.0) [61–65]. Possibly, more women than men with depression might be motivated to engage in artistic or complementary therapies. Otherwise, patients in Group 1 were recruited from all but one of the German federal states and resembled other primary care depression cohorts regarding symptom severity and functional impairment [18]. Nevertheless, study results might not be generalisable to men and to patients receiving conventional care only.

Healthcare costs in the 1st study year in Group 1 were similar to corresponding costs in another German primary care depression cohort [50]. The relative contribution of outpatient, hospital, and sick-leave costs in this study as well as the skewed cost distribution are also similar to findings from other studies [3, 16, 56, 66].

In our study, adjusted costs in patients with depression (Group 1) were twice as high as costs in patients without depression (Group 3) in the pre-study year and 80% higher in the 1st year. This large difference is in accordance with findings from several other studies with follow-up periods between 6 and 12 months [52, 54, 55, 67, 68].

Among study patients with relevant depressive symptoms, those starting treatment for chronic depression (Group 1) had higher costs than patients starting treatment for another chronic disorder (Group 2). The differences were large in the pre-study year and 1st year (63 and 71%, respectively) and were significant in the pre-study year, but were not significant in the 1st year unless outliers were removed from the analysis. It is difficult to explain this difference by differences in baseline CES-D scores, since baseline CES-D differed by only 10% or 3.0 points (34.4 and 31.4 points in Groups 1 and 2, respectively), and since there was no significant correlation between baseline CES-D and costs in these two groups. Notably, all patients in Group 1 had at least three core depression symptoms at baseline, while fulfilment of this criterion was not documented for Group 2. At baseline, a physician’s diagnosis of depression (ICD-10 F30-39) was documented in 100% and 25% of patients in Groups 1 and 2, respectively, while physicians referred patients to new therapy for depression in 100% and 0% of cases, respectively. Possibly, in depressed patients, the physician’s decision to initiate new therapy for depression is an independent marker for high costs. We are not aware of other cost studies comparing depressed primary care patients starting treatment either for depression or for another disorder.

In our study, costs of patients treated for depression (Group 1) in the 2nd study year were decreased by 27% (unadjusted) and 31% (adjusted), compared to the pre-study year, and reduced by 28 and 18%, respectively, compared to the 1st year. We are aware of three other depression cost analyses providing cost data for at least 2 consecutive years [6, 10, 51]: in 290 primary care patients starting antidepressant therapy for major depression, healthcare costs were analysed according to clinical status after 12 months, whereby 41% of patients had undergone remission, 47% were improved, and 12% had persistent depression. Compared to the 1st year, costs in the 2nd year were reduced by 48% and 33% in improved and remitted patients, respectively, while costs in patients with persistent depression were increased by 169% [6]. This would correspond to a 26% cost reduction in the whole sample. In a randomised trial of systematic depression treatment for 329 patients with diabetes and coexisting major depression, costs for outpatient treatment in the 2nd year were reduced by 6% in the intervention group and increased by 12% in a usual care control group, compared to the 1st year [10]. In 492 subjects in a health maintenance organisation with high healthcare costs in 2 consecutive years, direct costs in the 3rd year were increased by 5% in depressed subjects and reduced by 21% in non-depressed subjects, compared to the 1st year [51].

The cost reduction in Group 1 could be attributed largely to a reduction in hospital costs and long-term sick-leave. The reduction in resource use was paralleled by an improvement in depressive symptoms and health status [18] and might thus be related to the complementary and conventional therapies used. Since this study is a comparison by diagnosis in treated patients without an untreated control group, one has to consider other causes for these changes, such as natural recovery, regression to the mean, and secular trends. For improvement in symptoms, a possible explanation is regression to the mean, due either to sample truncation of Group 1 (CES-D ≥ 24 points at inclusion) or to symptom fluctuation with preferential self-selection to therapy and study inclusion at symptom peaks [18, 69]. However, these phenomena relate to symptoms, not costs, and baseline symptoms (CES-D) did not correlate with costs. Moreover, when Group 1 was reanalysed without truncation (second sensitivity analysis, with extension of eligibility criteria to include patients with baseline CES-D scores < 24 points) costs were comparable to costs in the main analysis. Altogether, regression to the mean cannot be ruled out but seems unlikely as a cause of the cost reduction in Group 1.

The cost reduction in Group 1 cannot be explained by secular trends towards a reduction in inpatient hospital days and long-term sick-leave during the study (1999–2003). In this period, the average number of hospital days per person-year in Germany decreased by only 0.21 days (from 2.07 to 1.86 days) [70]. This reduction of 0.11 days per 2 years corresponds to less than 2% of the observed reduction of 8.26 days per 2 years (from pre-study year to 2nd year: from 10.37 to 3.42 days) in Group 1. In the same period, the annual number of sick-leave days paid by the statutory health insurance (sick-leave beyond 42 days) was reduced by 0.35 days (from 5.23 to 4.88 days) [71], i.e. by 0.18 days per 2 years, corresponding to less than 2% of the reduction in Group 1 of 12.13 days (from 29.24 to 17.11 days).

Conclusions

This study in a complementary outpatient setting underlines the importance of depression for health costs and suggests that treatment of depression could be associated with long-term reduction of costs.