General Hospital Psychiatry

Volume 37, Issue 6, November–December 2015, Pages 567-576
Primary Care-Psychiatry
Screening and case finding for major depressive disorder using the Patient Health Questionnaire (PHQ-9): a meta-analysis

https://doi.org/10.1016/j.genhosppsych.2015.06.012

Abstract

Objective

The Patient Health Questionnaire (PHQ-9) is a widely used screening tool for major depressive disorder (MDD), although there is debate surrounding its diagnostic properties. For the PHQ-9, we aimed to:

1. Establish the diagnostic performance at the standard cutoff point (10).

2. Compare the diagnostic performance at the standard cutoff point in different clinical settings.

3. Assess whether there is selective reporting of cutoff points other than 10.

Methods

We searched three databases (Embase, MEDLINE and PsycINFO) and performed a reverse citation search in Web of Science. We selected for inclusion studies of any design that assessed the PHQ-9 in adult populations against recognized gold-standard instruments for the diagnosis of major depression according to either Diagnostic and Statistical Manual of Mental Disorders or International Classification of Diseases criteria. Included studies had to report sufficient information to calculate 2×2 contingency tables. Data extraction and synthesis were performed independently by two researchers. For the included studies, we calculated pooled sensitivity, pooled specificity, positive likelihood ratio, negative likelihood ratio and diagnostic odds ratio for cutoff points 7 to 15.
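For readers unfamiliar with these measures, the following minimal sketch shows how each one is derived from a single study's 2×2 table. It is an illustration only, not the pooling procedure used in this review: the class name `TwoByTwo`, the function name `diagnostic_metrics` and the example counts are hypothetical, and no continuity correction is applied for zero cells.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """One study's 2x2 table at a given PHQ-9 cutoff (hypothetical container)."""
    tp: int  # PHQ-9 positive, reference standard positive
    fp: int  # PHQ-9 positive, reference standard negative
    fn: int  # PHQ-9 negative, reference standard positive
    tn: int  # PHQ-9 negative, reference standard negative


def diagnostic_metrics(table: TwoByTwo) -> dict:
    """Return per-study accuracy measures.

    Assumes no cell is zero; a real analysis would need to handle
    zero cells (for example with a continuity correction).
    """
    sensitivity = table.tp / (table.tp + table.fn)
    specificity = table.tn / (table.tn + table.fp)
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    # The diagnostic odds ratio equals LR+ / LR-, i.e. (tp * tn) / (fp * fn).
    dor = lr_positive / lr_negative
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_positive": lr_positive,
        "lr_negative": lr_negative,
        "diagnostic_odds_ratio": dor,
    }


if __name__ == "__main__":
    # Made-up counts for illustration; not taken from any included study.
    example = TwoByTwo(tp=40, fp=30, fn=10, tn=120)
    for name, value in diagnostic_metrics(example).items():
        print(f"{name}: {value:.3f}")
```

Pooling across studies requires a meta-analytic model on top of these per-study values; the sketch covers only the single-table step.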

Results

Thirty-six studies (21,292 patients) met inclusion criteria. Pooled sensitivity for cutoff point 10 was 0.78 [95% confidence interval (CI), 0.70–0.84], and pooled specificity was 0.87 (95% CI, 0.84–0.90). At this cutoff, the PHQ-9 is a better screener in primary care than in secondary care settings. No conclusions could be drawn at cutoff points other than 10 due to selective reporting of data.
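To give a sense of what the pooled estimates at cutoff point 10 imply in practice, the sketch below converts them into likelihood ratios and then into post-test probabilities. The sensitivity (0.78) and specificity (0.87) are the pooled point estimates reported above; the two prevalence values are assumptions chosen purely for illustration and do not come from this review. Likelihood ratios computed this way from pooled point estimates need not match likelihood ratios pooled directly across studies.

```python
POOLED_SENSITIVITY = 0.78  # reported pooled estimate at cutoff point 10
POOLED_SPECIFICITY = 0.87  # reported pooled estimate at cutoff point 10


def likelihood_ratios(sensitivity: float, specificity: float) -> tuple:
    """Return (LR+, LR-) for a test with the given sensitivity and specificity."""
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative


def post_test_probability(pretest_probability: float, likelihood_ratio: float) -> float:
    """Apply a likelihood ratio to a pre-test probability via odds."""
    pretest_odds = pretest_probability / (1 - pretest_probability)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1 + posttest_odds)


if __name__ == "__main__":
    lr_pos, lr_neg = likelihood_ratios(POOLED_SENSITIVITY, POOLED_SPECIFICITY)
    print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")

    # Hypothetical MDD prevalences, chosen only to show the effect of setting.
    for prevalence in (0.10, 0.30):
        after_positive = post_test_probability(prevalence, lr_pos)
        after_negative = post_test_probability(prevalence, lr_neg)
        print(
            f"prevalence {prevalence:.0%}: "
            f"P(MDD | score >= 10) = {after_positive:.2f}, "
            f"P(MDD | score < 10) = {after_negative:.2f}"
        )
```

Under the assumed 10% prevalence, a score of 10 or more corresponds to a post-test probability of about 0.40; under the assumed 30% prevalence it is about 0.72. The same test result therefore carries different weight depending on how common MDD is in the setting.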

Conclusions

For MDD, the PHQ-9 has acceptable diagnostic properties at cutoff point 10 in different settings. We recommend that future studies report the full range of cutoff points to allow exploration of optimal cutoff points in different settings.

Introduction

Major depressive disorder (MDD) has a high prevalence in the general population and is associated with considerable morbidity, as well as a high financial cost to society [1]. The Patient Health Questionnaire (PHQ-9) is a self-report tool for screening and case finding for MDD and is based on the Primary Care Evaluation of Mental Disorders, a diagnostic tool developed in the mid-1990s. It is widely used in both clinical and research settings. An indication of its importance comes from its recommendation as a measurement tool for depressive symptoms by the most recent iteration of the Diagnostic and Statistical Manual of Mental Disorders (DSM, Fifth Edition).

Four systematic reviews and meta-analyses previously evaluated the diagnostic properties of the PHQ-9. One of these [2] evaluated how the instrument performs in primary care settings and compared the algorithm scoring method with the summed score ≥ 10. A meta-analysis published in 2015 by Manea et al. examined the psychometric properties of the PHQ-9 using the algorithm scoring method and compared this scoring method in different settings with the summed score method at a cutoff point of 10 [3]. Another review, conducted by Gilbody et al. and published in 2007, summarized the diagnostic properties of the PHQ-9 in different settings [4]. The authors of this review also attempted to summarize the psychometric properties of the PHQ-9 at alternative cutoff points; however, not enough validation studies were found at the time. This analysis was subsequently carried out by Manea et al. in 2012 [5]. That diagnostic meta-analysis suggested that the performance of the instrument at cutoff point 10 may be lower than that observed in the original validation study. The authors also suggested that different cutoff points may be required for different settings. It is therefore important to examine the performance of other cutoff points, which is one of the aims of this review. The review by Manea et al. also highlighted the possibility that there may be selective reporting of cutoff points and that this may artificially inflate the observed diagnostic performance of the measure, at least for cutoff points other than the standard one, which tends to be reported by all studies.

On this basis, the current review has three aims: first, to establish the diagnostic performance of the PHQ-9 at the standard cutoff point (given the popularity of the PHQ-9, the number of studies available to assess this has grown rapidly since the previous review); second, to compare the diagnostic performance of the PHQ-9 at the standard cutoff point in different clinical settings; third, to assess whether there is selective reporting of cutoff points other than 10.

Search strategy

We searched Embase, MEDLINE and PsycINFO from 1999 (when the PHQ-9 was issued) to September 2013 using the terms “PHQ-9,” “PHQ,” “PHQ$” and “patient health questionnaire.” We manually searched the reference lists of studies fitting the inclusion criteria and performed a reverse citation search in Web of Science. We contacted authors of unpublished studies and conference abstracts in an attempt to minimize publication bias. The search was performed by two independent reviewers (A.M. and L.M.),

Results

After removing the duplicates, we screened 4513 records for eligibility. Full text was reviewed for 65 papers that met initial inclusion criteria. Thirty-six of 65 met final-stage inclusion criteria. Study selection is summarized in the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) flowchart in Fig. 1, and further details about the reasons for exclusion are given in Appendix 1.

Main findings

Since the first diagnostic meta-analysis of the diagnostic accuracy of the PHQ-9 at different cutoff points was conducted, the number of validation studies that fulfilled the inclusion criteria has doubled. The PHQ-9 has been translated and validated in many languages, countries and settings, and significantly more data were available for this review. However, given that most studies reported a small range of cutoff points, the results for other cutoff points than 10 are more difficult to

Conclusions

The aims of the review were to establish the diagnostic performance of the PHQ-9 at the standard cutoff point (10), to compare the diagnostic performance of the PHQ-9 at the standard cutoff point in different clinical settings and to assess whether there is selective reporting of cutoff points other than 10.

Our results further support the conclusions of the previous meta-analysis that the sensitivity of the PHQ-9 at cutoff point 10 is lower than that reported in the original validation study,

References (49)

  • B. Lowe et al.

    Comparative validity of three screening questionnaires for DSM-IV depressive disorders and physicians' diagnoses

    J Affect Disord

    (2004)
  • B.P. Lai et al.

    Detecting postnatal depression in Chinese men: a comparison of three instruments

    Psychiatry Res

    (2010)
  • S. Liu et al.

    Validation of Patient Health Questionnaire for depression screening among primary care patients in Taiwan

    Compr Psychiatry

    (2011)
  • Y. Zhang et al.

    Measuring depressive symptoms using the Patient Health Questionnaire-9 in Hong Kong Chinese subjects with type 2 diabetes

    J Affect Disord

    (2013)
  • F. Lamers et al.

    Summed score of the Patient Health Questionnaire-9 was a reliable and valid method for depression screening in chronically ill elderly patients

    J Clin Epidemiol

    (2008)
  • K. Wittkampf et al.

    The accuracy of Patient Health Questionnaire-9 in detecting depression and measuring depression severity in high-risk groups in primary care

    Gen Hosp Psychiatry

    (2009)
  • R. Navines et al.

    Depressive and anxiety disorders in chronic hepatitis C patients: reliability and validity of the Patient Health Questionnaire

    J Affect Disord

    (2012)
  • M. Valenstein et al.

    The cost–utility of screening for depression in primary care

    Ann Intern Med

    (2001)
  • S. Gilbody et al.

    Screening for depression in medical settings with the Patient Health Questionnaire (PHQ): a diagnostic meta-analysis

    J Gen Intern Med

    (2007)
  • L. Manea et al.

    Optimal cut-off score for diagnosing depression with the Patient Health Questionnaire (PHQ-9): a meta-analysis

    Can Med Assoc J

    (2012)
  • K. Kroenke et al.

    The PHQ-9: validity of a brief depression severity measure

    J Gen Intern Med

    (2001)
  • S.D. Walter

    Properties of the summary receiver operating characteristic (SROC) curve for diagnostic test data

    Stat Med

    (2002)
  • J.P.T. Higgins et al.

    Measuring inconsistency in meta-analyses

    Br Med J

    (2003)
  • S.G. Thompson et al.

    How should meta-regression analyses be undertaken and interpreted?

    Stat Med

    (2002)