Are self-report scales as effective as clinician rating scales in measuring treatment response in routine clinical practice?

doi:10.1016/j.jad.2017.08.024

Journal of Affective Disorders

Volume 225, 1 January 2018, Pages 449-452

https://doi.org/10.1016/j.jad.2017.08.024

Highlights

• Recent treatment guidelines have suggested that outcome should be measured in routine clinical practice.
• We compared 3 self-report and 2 clinician scales of depressive symptoms in evaluating outcome in routine practice.
• The magnitude of change of depressive symptoms is as great on self-report scales as on clinician rating scales.

Abstract

Objective

Recent treatment guidelines have suggested that outcome should be measured in routine clinical practice. In the present report from the Rhode Island Methods to Improve Diagnostic Assessment and Services (MIDAS) project, we compared three self-report scales of depressive symptoms and the two most widely used clinician administered scales in treatment studies in their sensitivity to change and evaluation of treatment response in depressed patients treated in routine practice.

Methods

At baseline and 4-month follow-up 153 depressed outpatients with DSM-IV MDD completed the Clinically Useful Depression Outcome Scale (CUDOS), Quick Inventory of Depressive Symptomatology—Self-report version (QIDS-SR), and Patient Health Questionnaire (PHQ-9). The patients were rated on the 17-item Hamilton Depression Rating Scale (HAMD) and the Montgomery-Asberg Depression Rating Scale (MADRS). On each scale treatment response was defined as a 50% or greater reduction in scores from baseline.
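
The response criterion above is simple arithmetic, and a minimal sketch may make it concrete. The function names `is_responder` and `response_rate` and the sample scores are hypothetical illustrations, not taken from the study; the only element drawn from the text is the 50% reduction threshold.

```python
from typing import Iterable, Tuple


def is_responder(baseline: float, follow_up: float, threshold: float = 0.5) -> bool:
    """Return True when the score fell by at least `threshold` (as a fraction) from baseline."""
    if baseline <= 0:
        # A proportional reduction is undefined for a zero or negative baseline.
        raise ValueError("baseline score must be positive")
    reduction = (baseline - follow_up) / baseline
    return reduction >= threshold


def response_rate(pairs: Iterable[Tuple[float, float]]) -> float:
    """Fraction of (baseline, follow_up) score pairs that meet the response criterion."""
    flags = [is_responder(b, f) for b, f in pairs]
    if not flags:
        raise ValueError("at least one score pair is required")
    return sum(flags) / len(flags)


# Hypothetical scores on a single scale: (baseline, 4-month follow-up).
example_scores = [(20, 10), (20, 15), (30, 12), (18, 9)]
rate = response_rate(example_scores)
```

Because the criterion is proportional, it can be applied in the same way to each of the five scales even though their score ranges differ.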

Results

While there were some differences in the percentage of patients considered to be responders on the different scales, a large effect size was found for each scale, with little variability amongst the scales. The level of agreement between the three self-report scales and the clinician rating scales was approximately the same.

Limitations

The present study was conducted in a single clinical practice in which the majority of the patients were white, female, and had health insurance.

Discussion

When measuring outcome in clinical practice the magnitude of change in depressive symptoms is as great on self-report scales as on clinician rating scales.

Introduction

In psychiatry, quantified assessments of outcome are not the standard of care. Instead, in mental health clinical settings outcome evaluations are typically based on unstructured interactions that yield unquantified judgments of progress. This is at variance with other areas of medical care in which outcome is determined, in part, by the change in a numerical value. Body temperature, blood pressure, cholesterol values, blood sugar levels, cardiac ejection fraction, thyroid stimulating hormone levels, and white blood cell counts are examples of quantifiable variables that are used to evaluate treatment progress. Quantifiable outcome measures exist for most major psychiatric disorders, yet they are rarely used in routine clinical practice (Gilbody et al., 2002, Zimmerman and McGlinchey, 2008).

The quantitative measurement of treatment outcome has long been an integral component of research investigations of the efficacy and effectiveness of care. Recently, some investigators and treatment guidelines have suggested that measurement tools should be used to monitor the course of treatment in clinical practice (American Psychiatric Association, 2010, Harding et al., 2011, National Collaborating Centre for Mental Health, 2009, Trivedi et al., 2006). A better understanding of the effectiveness of psychiatric treatment in clinical practice depends, in part, on systematically measuring outcome. To accomplish this, reliable, valid, informative, and user-friendly scales are necessary. Clinicians are already overburdened with paperwork, and adding to this load by suggesting repeated detailed evaluations with such instruments as the Hamilton Rating Scale for Depression (HAMD) (Hamilton, 1960) or the Montgomery Asberg Depression Rating Scale (MADRS) (Montgomery and Asberg, 1979) is unlikely to meet with success. Clinician-rated scales are time consuming, require training to ensure the ratings are reliable and valid, and may be prone to clinician bias. Self-report questionnaires are inexpensive in terms of professional time needed for incorporation into the clinical encounter, they do not require special training for administration, and they correlate highly with clinician ratings. With modern technology, computer administered self-report assessments enable the conduct of large-scale outcome studies in clinical practice at low cost (Zimmerman and Martinez, 2012). Moreover, self-report scales are free of clinician bias, and are therefore free from the potential risk of clinician overestimation of patient improvement (which might occur when there is incentive to document treatment success).

A meta-analysis of treatment studies of depression found that effect sizes of treatment as assessed by self-administered scales were smaller than the effect sizes as assessed by clinician-rated measures (Cuijpers et al., 2010). Little research has compared the effect sizes of self-report and clinician-rated scales in routine clinical practice. While many self-report scales have been developed to measure the severity of depression (Nezu et al., 2000), Zimmerman et al. (2008b), in discussing the use of self-report scales in routine clinical practice, recommended measures that assess the DSM-IV criteria for major depressive disorder (MDD) and that are available for clinical use at no cost. Several such scales exist (Bech et al., 2001, Kroenke et al., 2001, Rush et al., 2003, Rush et al., 1996, Zimmerman et al., 2008a, Zimmerman et al., 2004). In consideration of increasing calls to demonstrate the effectiveness of treatment in routine practice, and the lower clinical burden imposed by self-report scales compared to clinician-rated scales, it is important to determine if the method of assessing outcome will significantly influence conclusions about the degree of treatment effectiveness.

Accordingly, in the present report from the Rhode Island Methods to Improve Diagnostic Assessment and Services (MIDAS) project, we compared three self-report scales assessing the DSM-IV symptom criteria for MDD and the 2 most widely used clinician administered scales in their sensitivity to change and evaluation of treatment response in depressed patients treated in routine practice.

Methods

One hundred fifty-three patients diagnosed with DSM-IV MDD who presented for treatment to the Rhode Island Hospital Department of Psychiatry outpatient practice (n = 78), or who were in ongoing treatment and had their medication changed due to lack of efficacy (n = 75), were evaluated at baseline and at 4-month follow-up. The mean interval between the baseline and follow-up evaluations was 16.4 weeks (SD = 4.2 weeks). Not all available patients participated in the study due to the lack of

Results

There was no difference in the amount of change between the patients who presented for treatment and those in ongoing treatment who had their medication changed; therefore, the data from these 2 groups were combined. On each scale, the patients showed significant levels of improvement from baseline to follow-up (Table 1). A large effect size was found for each scale (Table 1), with little variability amongst the scales.
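
As an illustration of how a pre-post effect size of this kind can be computed, the sketch below divides the mean change by the standard deviation of the baseline scores. This is only one common convention, adopted here as an assumption; the excerpt does not state which formula the authors used, and the function name and sample scores are hypothetical.

```python
from statistics import mean, stdev
from typing import Sequence


def pre_post_effect_size(baseline: Sequence[float], follow_up: Sequence[float]) -> float:
    """Standardized mean change: (mean baseline - mean follow-up) / SD of baseline scores.

    Assumption: the baseline SD is used as the standardizer. Other conventions
    (for example, the SD of the change scores) give different values.
    """
    if len(baseline) != len(follow_up):
        raise ValueError("baseline and follow_up must be the same length")
    if len(baseline) < 2:
        raise ValueError("at least two patients are needed to estimate an SD")
    mean_change = mean(baseline) - mean(follow_up)
    return mean_change / stdev(baseline)


# Hypothetical scores for three patients on one scale.
baseline_scores = [20, 24, 28]
follow_up_scores = [10, 14, 18]
d = pre_post_effect_size(baseline_scores, follow_up_scores)
```

Under Cohen's (1988) conventional benchmarks, values of about 0.8 or more are usually described as large. Computing the same statistic for each scale would allow the kind of cross-scale comparison described above.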

All correlations between the scales in change in scores from baseline to 4 months

Discussion

In the past few years there have been increasing calls for the utilization of such standardized measures to assess outcome in clinical practice (American Psychiatric Association, 2010, Harding et al., 2011, Morris and Trivedi, 2011), and self-report scales are more likely to be used than clinician-rated scales such as the HAMD and MADRS. The present study found that when measuring outcome in clinical practice the magnitude of change in depressive symptoms

Acknowledgments

None.

References (30)

P. Bech et al. The sensitivity and specificity of the major depression inventory, using the present state examination as the index of diagnostic validity. J. Affect. Disord. (2001)
M.M. Biggs et al. A comparison of alternative assessments of depressive symptom severity: a pilot study. Psychiatry Res. (2000)
J.D. Carter et al. The relationship of demographic, clinical, cognitive and personality variables to the discrepancy between self and clinician rated depression. J. Affect. Disord. (2010)
P. Cuijpers et al. Self-reported versus clinician-rated symptoms of depression as outcome measures in psychotherapy research on depression: a meta-analysis. Clin. Psychol. Rev. (2010)
M. Domken et al. What factors predict discrepancies between self and observer ratings of depression? J. Affect. Disord. (1994)
B.W. Dunlop et al. Concordance between clinician and patient ratings as predictors of response, remission, and recurrence in major depressive disorder. J. Psychiatr. Res. (2011)
A. Rush et al. The 16-item quick inventory of depressive symptomatology (QIDS), clinician rating (QIDS-C), and self-report (QIDS-SR): a psychometric evaluation in patients with chronic major depression. Biol. Psychiatry (2003)
A.J. Rush et al. An evaluation of the quick inventory of depressive symptomatology and the Hamilton Rating Scale for Depression: a sequenced treatment alternatives to relieve depression trial report. Biol. Psychiatry (2006)
M. Zimmerman et al. A clinically useful depression outcome scale. Compr. Psychiatry (2008)
M. Zimmerman et al. Have treatment studies of depression become even less generalizable?: a review of the inclusion and exclusion criteria in placebo controlled antidepressant efficacy trials published during the past 20 years. Mayo Clin. Proc. (2015)
American Psychiatric Association. Practice Guideline for the Treatment of Patients With Major Depressive Disorder. (2010)
P. Bech. Meta-analysis of placebo-controlled trials with mirtazapine using the core items of the Hamilton Depression Scale as evidence of a pure antidepressive effect in the short-term treatment of major depression. Int. J. Neuropsychopharmacol. (2001)
P. Bech et al. HAM-D17 and HAM-D6 sensitivity to change in relation to desvenlafaxine dose and baseline depression severity in major depressive disorder. Pharmacopsychiatry (2010)
J. Cohen. Statistical Power Analysis for the Behavioral Sciences. (1988)
M.B. First et al. Structured Clinical Interview for DSM-IV Axis I Disorders - Patient edition (SCID-I/P, version 2.0). (1995)
