An introduction to Rasch analysis for Psychiatric practice and research
Introduction
The use of patient reported outcomes in health care in general, and psychiatry in particular, has expanded rapidly over recent years. The number of instruments designed to measure latent constructs such as anxiety, depression and self-harm has increased steadily (Bowen et al., 2008; Brunner et al., 2007; Fliege et al., 2009; Gamez et al., 2007; Garlow et al., 2008; Honarmand and Feinstein, 2009; King et al., 2008; Klonsky et al., 2003; Latimer et al., 2009; Parker et al., 2005; Pedersen, 2006; Pomerleau et al., 2003; Terluin et al., 2006; Tuisku et al., 2009). While some instruments are administered by professionals, the majority are self-completed ‘patient reported outcomes’ and are widely used in both clinical practice and research (Bech, 2008; Chan et al., 2010; Chandler et al., 2010; Counts et al., 2010; Hawton et al., 2002; Norris and Aroian, 2008; Steinhausen et al., 2009). The obvious value of such instruments is that they minimize the burden of assessment upon patients and can be applied to large numbers of respondents, whereas structured clinical interviews are more restricted in this respect, or not feasible at all.
However, the use of such scales has been the subject of some debate. Marshall et al. (2000), examining a number of controlled trials in schizophrenia, found that the intervention was more likely to be found effective when unpublished scales were used, as opposed to validated ones. Another issue, which has rarely been considered, is that the majority of instruments yield ordinal scores, which indicate only rank relationships (Stevens, 1946). Such scores cannot support mathematical calculations such as change scores or parametric effect sizes (Smith, 2001). Consequently, using ordinal scores in sophisticated parametric analyses could lead to incorrect inferences from the findings (Merbitz et al., 1989). However, ordinal scales are perfectly acceptable when the objective is to identify a cut point on the trait under consideration, as in many instruments designed, for example, to ascertain depression. This application relies only on a specific magnitude, which is available from an ordinal scale. Thus, the problem is not necessarily the scales themselves (although it may be), but rather the way in which they are analysed.
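The distinction drawn above can be illustrated with a small sketch. It assumes two hypothetical patients and a hypothetical cut point (none of these numbers come from the study), and uses squaring as one arbitrary order-preserving recoding of the scores. Because an ordinal score carries only rank information, any such recoding is an equally legitimate representation of it; the sketch shows that a comparison of change scores can reverse under the recoding, while classification against a cut point does not.

```python
# Hypothetical ordinal totals (before, after) for two patients; not study data.
patients = {"A": (10, 14), "B": (20, 23)}


def rescale(score):
    """An order-preserving recoding of non-negative scores (here, squaring)."""
    return score ** 2


def change(before, after, transform=lambda s: s):
    """Change score computed after applying a transform to both scores."""
    return transform(after) - transform(before)


raw_change = {p: change(b, a) for p, (b, a) in patients.items()}
rescaled_change = {p: change(b, a, rescale) for p, (b, a) in patients.items()}

print(raw_change)       # on the raw coding, A changes by more points than B
print(rescaled_change)  # after recoding, B shows the larger change

# A cut point survives the recoding: the same patients fall at or above it.
CUT = 15  # hypothetical cut point
above_raw = {p for p, (_, a) in patients.items() if a >= CUT}
above_rescaled = {p for p, (_, a) in patients.items() if rescale(a) >= rescale(CUT)}
print(above_raw == above_rescaled)
```

The ranking of patients by score is identical under both codings, which is all an ordinal scale guarantees; only the arithmetic on differences is affected.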
In developing patient reported outcomes, the usual procedure has been to generate a scale with a certain number of items intended to assess observable behaviours related to the construct of interest (Tesio, 2003). Therefore, when setting out to measure such a construct we look for indicators (items) which are related to the construct, preferably in a way specified by an underlying theory. When someone responds to a certain question or item, the probability that the subject endorses the item should depend on their level of the latent trait or ability (Baker, 2001). For example, it is expected that a more depressed subject will endorse an item regarding hopelessness more frequently than a non-depressed one. While this particular item does not directly measure depression (it addresses hopelessness), it contributes to the depression score, together with other related items which are designed to measure the latent variable (depression in this case).
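One way to formalise this dependence is the dichotomous Rasch model, which this article goes on to discuss: the probability of endorsing an item is a logistic function of the difference between the person's location on the trait and the item's location, both expressed in logits. The following is a minimal sketch under that assumption; the function name, the item location and the person locations are hypothetical values chosen for illustration, not estimates from any data set.

```python
import math


def rasch_probability(theta, b):
    """Probability of endorsing a dichotomous item under the Rasch model.

    theta: person location on the latent trait (logits)
    b:     item location, i.e. its difficulty or severity (logits)
    """
    return 1.0 / (1.0 + math.exp(-(theta - b)))


# Hypothetical location for a "hopelessness" item (illustrative only).
HOPELESSNESS_ITEM = 1.0

# Three hypothetical respondents with increasing levels of depression.
for theta in (-1.0, 1.0, 3.0):
    p = rasch_probability(theta, HOPELESSNESS_ITEM)
    print(f"theta = {theta:+.1f}  P(endorse) = {p:.3f}")
```

When the person and item locations coincide the probability is exactly 0.5, and it rises towards 1 as the respondent's level of the trait exceeds the item's location.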
In order to put together a set of items with the expectation that they measure the target construct, a set of psychometric requirements must be satisfied. These requirements can be grouped into those associated with Classical Test Theory (CTT) and those associated with Modern Test Theory (MTT), although in practice there is considerable overlap between the two. The present article aims to briefly review the former, and then to describe the potential contributions of the latter, in particular Rasch analysis, to the development and testing of instruments. The Beck Depression Inventory (BDI) will be used as a practical example for this purpose.
Classical Test Theory
The measurement properties of most patient reported outcomes to date have been evaluated from the CTT perspective. This has entailed publication of evidence concerning the reliability and the validity of the instrument. Reliability concerns whether or not the instrument is consistent, both internally (Cronbach's alpha) and over time (test–retest). Validity is often reported to comprise three central aspects, namely construct, criterion and content validity. These represent
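As an illustration of the internal consistency statistic mentioned above, the following minimal sketch computes Cronbach's alpha from a respondents-by-items table using the standard formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). The function name and the response data are hypothetical, and the sketch assumes complete data with at least two items and two respondents.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Uses sample variances (n - 1 denominator) for both items and totals.
    """
    k = len(responses[0])
    item_columns = list(zip(*responses))  # one tuple of scores per item
    sum_item_variances = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum_item_variances / total_variance)


# Hypothetical responses: 4 respondents answering 3 items scored 0-3.
example = [
    [0, 1, 1],
    [1, 2, 2],
    [2, 2, 3],
    [3, 3, 3],
]
print(round(cronbach_alpha(example), 3))
```

Alpha approaches 1 when items vary together across respondents; it says nothing about whether the summed score is on an interval scale, which is the question the later sections take up.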
Modern Test Theory (MTT) and the Rasch model
The first MTT models (under the generic label of Item Response Theory, IRT) appeared in the 1950s in education, based on the need to build tests that would be at the same time simple, valid and highly discriminating (Embretson and Reise, 2000). IRT represents a group of several distinct models, which share the assumption that the response to any particular item is a function of the difference between the ability of the person (or in our example their level of
An example using the BDI
To illustrate how data are fitted to the Rasch model, responses were collected from a sample of 122 chronic patients, of whom 66 (54.1%) were male and 56 (45.9%) were female. The most frequently reported health problems were hypertension (18%), heart disease (15.6%), neoplasms (13.1%), diabetes (13.1%), emphysema/asthma/bronchitis (11.5%), autoimmune diseases (8.2%), and kidney diseases (8.2%). They were recruited in a tertiary hospital in Porto Alegre-RS-Brazil, in the different clinical
Discussion: Rasch applications in clinical research
This introductory paper highlights the potential of Rasch analysis for psychiatric practice and research. The BDI was used here merely as an example. After some adjustments, the BDI has been shown to satisfy Rasch model expectations in a mixed diagnostic sample from a tertiary hospital. Designed to be used in a clinical sample of depressed patients to ascertain the severity of that depression, the distribution of thresholds across the continuum of depression is consistent with
Role of the funding source
This study was partially funded by FIPE-HCPA, CAPES and the University of Edinburgh.
Contributors
All authors managed the literature searches. Neusa Rocha and Alan Tennant undertook the statistical analysis, and Neusa Rocha, Eduardo Chachamovich and Marcelo Fleck wrote the first draft of the manuscript. All authors contributed to and have approved the final manuscript.
Conflict of interest
The authors declare that they have no conflict of interest.
Acknowledgements
None.
References (83)
- et al. Literacy affected ability to adequately discriminate among categories in multipoint Likert Scales. Journal of Clinical Epidemiology (2009)
- et al. The development and validation of the protective factors survey: a self-report measure of protective factors against child maltreatment. Child Abuse & Neglect (2010)
- et al. Axis I comorbidity and psychopathologic correlates of autodestructive syndromes. Comprehensive Psychiatry (2009)
- et al. Abnormal personality and the mood and anxiety disorders: implications for structural models of anxiety and depression. Journal of Anxiety Disorders (2007)
- et al. Screening for depression: Rasch analysis of the dimensional structure of the PHQ-9 and the HADS-D. Journal of Affective Disorders (2010)
- et al. Assessing reliability and validity of the Arabic language version of the Post-traumatic Diagnostic Scale (PDS) symptom items. Psychiatry Research (2008)
- et al. Measuring pain: issues of interpretation. The Lancet (2008)
- et al. Remediating serious flaws in the National Eye Institute Visual Function Questionnaire. Journal of Cataract & Refractive Surgery (2010)
- et al. Prevalence of self-reported seasonal affective disorders and the validity of the seasonal pattern assessment questionnaire in young adults: findings from a Swiss community study. Journal of Affective Disorders (2009)
- et al. Application of Rasch analysis in the development and application of quality of life instruments. Value in Health (2004)
- Factors associated with deliberate self-harm behaviour among depressed adolescent outpatients. Journal of Adolescence
- An IRT validation of the Affective Self Rating Scale. Nordic Journal of Psychiatry
- Is the Beck Depression Inventory reliable over time? An evaluation of multiple test–retest reliability in a nonclinical college student sample. Journal of Personality Assessment
- Suicidal ideation among students enrolled in healthcare training programs: a cross-sectional study. Revista Brasileira de Psiquiatria
- A rating formulation for ordered response categories. Psychometrika
- Rasch models for measurements
- RUMM: a Windows program for analysing item response data according to Rasch Unidimensional Measurement Models
- The basics of item response theory
- Pichot – a tribute to the European psychopharmacologist on his 90th birthday. European Psychiatric Review
- BDI-II manual
- Applying the Rasch model: fundamental measurement in the human sciences
- Homogeneity of Beck's Depression Inventory (BDI): applying Rasch analysis in conceptual exploration. Acta Psychiatrica Scandinavica
- Anxiety in a socially high-risk sample of pregnant women in Canada. Canadian Journal of Psychiatry
- Prevalence and psychological correlates of occasional and repetitive deliberate self-harm in adolescents. Archives of Pediatrics & Adolescent Medicine
- Ascertaining late-life depressive symptoms in Europe: an evaluation of the survey version of the EURO-D scale in 10 nations. The SHARE project. International Journal of Methods in Psychiatric Research
- Preliminary evidence for the development of a stroke specific geriatric depression scale. International Journal of Geriatric Psychiatry
- Development and validation of the Brazilian version of the Attitudes to Aging Questionnaire (AAQ): an example of merging classical psychometric theory and the Rasch measurement model. Health and Quality of Life Outcomes
- Psychometric evaluation of the Hospital Anxiety and Depression Scale in a large community sample of adolescents in Hong Kong. Quality of Life Research
- RESEARCH: validation of the Massachusetts general hospital Antidepressant Treatment History Questionnaire (ATRQ). CNS Neuroscience & Therapeutics
- The core symptoms of depression in medical and psychiatric patients. Journal of Nervous and Mental Disease
- Variability in depression prevalence in early rheumatoid arthritis: a comparison of the CES-D and HAD-D Scales. BMC Musculoskeletal Disorders
- Item response theory for psychologists
- Rasch models. Foundations, recent developments and applications
- On the mathematical foundations of theoretical statistics. Philosophical Transactions of the Royal Society
- Development of an item bank for the assessment of depression in persons with mental illnesses and physical diseases using Rasch analysis. Rehabilitation Psychology
- Depression, desperation, and suicidal ideation in college students: results from the American Foundation for Suicide Prevention College Screening Project at Emory University. Depression and Anxiety
- Rasch analysis of the hospital anxiety and depression scale (HADS) for use in motor neurone disease. Health and Quality of Life Outcomes
- Health status measurement in Parkinson's disease: validity of the PDQ-39 and Nottingham Health Profile. Movement Disorders
- Deliberate self harm in adolescents: self report survey in schools in England. British Medical Journal
- Confirmatory factor analysis of the Beck Depression Inventory in obese individuals seeking surgery. Obesity Surgery
- Factor structure of the Beck Depression Inventory in a university sample. Psychological Reports