Original Article
The Factor Structure of the SF-36 Health Survey in 10 Countries: Results from the IQOLA Project

https://doi.org/10.1016/S0895-4356(98)00107-3

Abstract

Studies of the factor structure of the SF-36 Health Survey are an important step in its construct validation. Its structure is also the psychometric basis for scoring physical and mental health summary scales, which are proving useful in simplifying and interpreting statistical analyses. To test the generalizability of the SF-36 factor structure, product-moment correlations among the eight SF-36 Health Survey scales were estimated for representative samples of general populations in each of 10 countries. Matrices were independently factor analyzed using identical methods to test for hypothesized physical and mental health components, and results were compared with those published for the United States. Following simple orthogonal rotation of two principal components, they were easily interpreted as dimensions of physical and mental health in all countries. These components accounted for 76% to 85% of the reliable variance in scale scores across nine European countries, in comparison with 82% in the United States. Similar patterns of correlations between the eight scales and the components were observed across all countries and across age and gender subgroups within each country. Correlations with the physical component were highest (0.64 to 0.86) for the Physical Functioning, Role Physical, and Bodily Pain scales, whereas the Mental Health, Role Emotional, and Social Functioning scales correlated highest (0.62 to 0.91) with the mental component. Secondary correlations for both clusters of scales were much lower. Scales measuring General Health and Vitality correlated moderately with both physical and mental health components. These results support the construct validity of the SF-36 translations and the scoring of physical and mental health components in all countries studied.

Introduction

According to cross-cultural psychometric research traditions, the degree to which universal concepts are captured by a measure can be evaluated by examining the extent to which measurement and structural models are replicated cross-culturally [1, 2, 3, 4, 5, 6, 7]. Measurement models concentrate on the scoring of response choices and hypothesized item groupings and are the basis for scale scoring algorithms. Structural models focus on the relationship of scales to each other and are useful in scale interpretation. They are the psychometric basis for scoring summary measures. To the extent that scales have the same relationships with other scales and with components of health across countries, the meaning and interpretation of scale scores and summary components are more likely to be comparable.

In its protocol for translating and validating the SF-36 Health Survey, the International Quality of Life Assessment (IQOLA) Project adopted a comprehensive three-stage research methodology that examines the extent to which the SF-36 measurement and structural models are replicated cross-culturally [8]. Stages 1 (reproduction of the questionnaire) and 2 (reproduction of the scoring algorithms) address the issues of conceptual equivalence of the translation and satisfactory replication of the SF-36 measurement model. Stage 3 focuses on the structural model, including examination of the pattern of correlations among SF-36 scales and the relationship of scales to external variables. A number of studies have tested the structure of the SF-36 in other countries [9, 10, 11, 12, 13, 14], and other studies have evaluated its cross-cultural validity (see Gandek et al. [15] and other articles in this issue). This article uses traditional factor analysis methods to examine the equivalence of the SF-36 structural model in 10 countries. A companion article included in this issue uses a structural equation modeling approach [16].

The SF-36 measures a full range of health states and includes multi-item scales measuring each of eight health concepts: physical functioning (PF), role limitations due to physical health (RP), bodily pain (BP), general health (GH), vitality (VT), social functioning (SF), role limitations due to emotional problems (RE), and mental health (MH) [17, 18]. These eight scales have been observed to define distinct physical and mental health clusters in factor analytic studies of both general and patient populations in the United States [19, 20, 21]. The physical and mental health components accounted for more than 80% of the reliable variance in SF-36 scale scores and have been very useful in establishing interpretation guidelines for each of the SF-36 scales [19, 20]. Results to date from published studies in Australia [9], Italy [10], the Netherlands [11], Sweden [12], Switzerland [13], and the United Kingdom [14] also provided considerable support for the construct validity of the SF-36.

In this article, principal components analysis was used to test the generalizability of the two-dimensional SF-36 model using identical methods across nine Western European countries. Principal components analysis gauges the congruence between the hypothesized physical and mental health constructs and the SF-36 scales used to measure those constructs. To the degree that the two-dimensional structure is replicated across countries, the construct validity of the SF-36 and of the structural model used in scoring summary health measures is supported across countries. The results also have implications for the interpretation of each SF-36 scale as a measure of physical or mental health.
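To make the procedure concrete, the following is a minimal sketch of a two-component principal components analysis of an 8 x 8 scale correlation matrix, followed by an orthogonal rotation. It is not the authors' code. The correlation matrix is generated from made-up loadings and is not data from the study, the function names are hypothetical, and the rotation shown is raw varimax, which is an assumption here because the text above says only "simple orthogonal rotation". NumPy is assumed to be available.

```python
import numpy as np

SCALES = ["PF", "RP", "BP", "GH", "VT", "SF", "RE", "MH"]

def illustrative_correlations():
    """Return a made-up 8x8 correlation matrix (NOT data from the study)."""
    # Hypothetical generating loadings on (physical, mental) factors.
    g = np.array([[0.82, 0.08], [0.76, 0.22], [0.72, 0.18], [0.60, 0.38],
                  [0.42, 0.60], [0.38, 0.66], [0.18, 0.72], [0.12, 0.84]])
    r = g @ g.T
    np.fill_diagonal(r, 1.0)  # a correlation matrix has unit diagonal
    return r

def principal_components(r, n_components=2):
    """Eigen-decompose a correlation matrix; return (eigenvalues, loadings)."""
    eigvals, eigvecs = np.linalg.eigh(r)  # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Loadings are scale-component correlations: eigenvector * sqrt(eigenvalue).
    loadings = eigvecs[:, :n_components] * np.sqrt(eigvals[:n_components])
    return eigvals, loadings

def varimax_two(loadings):
    """Raw varimax rotation for exactly two components (closed-form angle)."""
    x, y = loadings[:, 0], loadings[:, 1]
    p = len(x)
    u, v = x**2 - y**2, 2 * x * y
    num = 2 * np.sum(u * v) - 2 * u.sum() * v.sum() / p
    den = np.sum(u**2 - v**2) - (u.sum() ** 2 - v.sum() ** 2) / p
    phi = np.arctan2(num, den) / 4  # angle that maximizes the varimax criterion
    c, s = np.cos(phi), np.sin(phi)
    rotated = loadings @ np.array([[c, -s], [s, c]])
    # Component signs are arbitrary; flip so each column sums to a positive value.
    return rotated * np.sign(rotated.sum(axis=0))

eigvals, unrotated = principal_components(illustrative_correlations())
rotated = varimax_two(unrotated)
for name, row in zip(SCALES, rotated):
    print(f"{name}: {row[0]: .2f} {row[1]: .2f}")
```

Before rotation, the first component is typically a general health factor and the second a physical-versus-mental contrast. The rotation redistributes the same explained variance so that each component is dominated by one cluster of scales. The order of the rotated components is arbitrary, so the component on which PF loads highest would be read as the physical one.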


Sample and Data Collection

SF-36 data came from general population surveys in Denmark, France, Germany, Italy, the Netherlands, Norway, Sweden, Spain, the United Kingdom, and the United States. Translations of the SF-36 were developed and tested using a standard methodology operationalized by the IQOLA Project [22]. The IQOLA methodology also specified a standard protocol for collection of normative general population data. Briefly, the protocol included guidelines for sample size, use of representative sampling of

Results

As hypothesized, eigenvalues for the first two components were generally greater than unity (Table 1). In Italy, Spain, and the United Kingdom, eigenvalues for the second component were slightly lower than unity (0.88 to 0.97). Across the 10 countries, 65.6% to 71.6% (median = 69.6%) of the total variance and 76.3% to 84.7% (median = 82.4%) of the reliable variance in SF-36 scale scores were accounted for by the two components.
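The two percentages reported above can be illustrated with a short sketch. All input values below are made up for illustration and are not taken from Table 1. The definition of reliable variance used here (variance explained by the components divided by the summed scale reliabilities) is an assumption, since this excerpt does not state the formula, and the function names are hypothetical.

```python
def components_above_unity(eigenvalues):
    """Count components whose eigenvalue exceeds 1 (the eigenvalue-greater-than-unity rule)."""
    return sum(1 for ev in eigenvalues if ev > 1.0)

def total_variance_explained(eigenvalues, n_components=2):
    """Share of total variance captured by the largest n_components eigenvalues."""
    ordered = sorted(eigenvalues, reverse=True)
    # For a correlation matrix the eigenvalues sum to the number of scales.
    return sum(ordered[:n_components]) / sum(ordered)

def reliable_variance_explained(communalities, reliabilities):
    """ASSUMED definition: explained variance relative to reliable variance."""
    return sum(communalities) / sum(reliabilities)

# Hypothetical values for eight scales (PF, RP, BP, GH, VT, SF, RE, MH).
eigenvalues = [4.40, 1.16, 0.62, 0.50, 0.42, 0.36, 0.30, 0.24]
communalities = [0.74, 0.72, 0.70, 0.62, 0.68, 0.66, 0.70, 0.74]
reliabilities = [0.90, 0.88, 0.85, 0.80, 0.82, 0.78, 0.84, 0.86]

n_retained = components_above_unity(eigenvalues)
total_share = total_variance_explained(eigenvalues)
reliable_share = reliable_variance_explained(communalities, reliabilities)
print(n_retained, round(total_share, 3), round(reliable_share, 3))
```

Under this assumed definition the reliable-variance share always exceeds the total-variance share whenever scale reliabilities are below 1, which matches the direction of the difference between the two ranges reported above.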

The total and reliable variance explained in each SF-36 scale by the

Discussion

Overall, these results are consistent with those from other studies [9, 10, 11, 12, 13, 14] and strongly support the generalizability of the two-dimensional (physical and mental health) model of the SF-36 across the nine Western European countries studied. As previously reported for the United States [18, 19, 20], the interpretation of the two derived components as dimensions of physical and mental health was straightforward and robust across countries and across age and gender subgroups within

References (32)

  • R.B. Cattell. Comparing factor trait and state scores across ages and cultures. J Gerontol (1989)
  • A.R. Davidson et al. Cross-cultural model testing: Toward a solution of the emic-etic dilemma. Int J Psychol (1976)
  • S.H. Irvine. Adapting tests to the cultural setting: A comment. Occup Psychol (1965)
  • F. van de Vijver et al. Methods and Data Analysis for Cross-Cultural Research (1997)
  • Ware JE, Gandek B, Keller SD, the IQOLA Project Group. Evaluating instruments used cross-nationally: Methods from the...
  • G. Apolone et al. Questionario Sullo Stato di Salute SF-36: Manuale d’uso e Guida all’Interpretazione dei Risultati (1997)