Abstract
Item response theory (IRT) is a powerful tool for detecting differential item functioning (DIF). This chapter shows that the class of IRT models with manifest predictors provides a comprehensive framework for detecting DIF and for investigating its causes. In principle, the responses to every item in a test can be subject to DIF, and traditional IRT-based detection methods require one or more estimation runs for every single item. Glas (1998) therefore proposed an alternative procedure that requires only a single estimate of the item parameters. This procedure is based on the Lagrange multiplier test, or equivalently the Rao efficient score test. In this chapter, the procedure is generalized in several directions, the most important being the possibility of conditioning on general covariates. A small simulation study gives an impression of the power of the test. An example using real data shows how the method can be applied to identify main and interaction effects in DIF.
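The appeal of a Lagrange multiplier (score) test is that it needs only the fit of the null model. The sketch below illustrates that idea and is not the chapter's procedure. The chapter works within IRT models with manifest predictors, whereas this example swaps in an ordinary logistic regression and assumes that ability is an observed covariate. The function names (`fit_logistic`, `lm_test`, `simulate_and_test`) and the simulated data are hypothetical.

```python
import numpy as np


def fit_logistic(X, y, n_iter=50, tol=1e-10):
    """Newton-Raphson maximum likelihood fit of a logistic regression."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (y - p)                        # score vector
        info = (X * (p * (1 - p))[:, None]).T @ X   # Fisher information
        step = np.linalg.solve(info, grad)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def lm_test(X_null, z, y):
    """Score test of H0: the coefficient(s) of z equal zero.

    Only the null model (without z) is estimated. The statistic is
    asymptotically chi-square with df = number of columns of z.
    """
    z = z.reshape(len(y), -1)
    beta = fit_logistic(X_null, y)
    p = 1.0 / (1.0 + np.exp(-X_null @ beta))
    w = p * (1 - p)
    score = z.T @ (y - p)                 # derivatives w.r.t. the extra parameters
    i11 = (X_null * w[:, None]).T @ X_null
    i12 = (X_null * w[:, None]).T @ z
    i22 = (z * w[:, None]).T @ z
    # Information for the extra parameters, adjusted for the estimated ones
    W = i22 - i12.T @ np.linalg.solve(i11, i12)
    return float(score @ np.linalg.solve(W, score))


def simulate_and_test(dif, n=2000, seed=0):
    """Simulate one item with a uniform DIF shift `dif` for group 1 and test it."""
    rng = np.random.default_rng(seed)
    theta = rng.normal(size=n)                      # ability, assumed observed here
    g = rng.integers(0, 2, size=n).astype(float)    # group membership covariate
    logit = theta - 0.2 - dif * g
    y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(float)
    X_null = np.column_stack([np.ones(n), theta])   # null model: no group effect
    return lm_test(X_null, g, y)


if __name__ == "__main__":
    print("LM statistic with DIF:   ", simulate_and_test(dif=1.0))
    print("LM statistic without DIF:", simulate_and_test(dif=0.0))
```

With one extra parameter, the statistic would be compared with the chi-square critical value for one degree of freedom (3.84 at the 5% level). Because `z` may have several columns, the same function can in principle test several covariate effects jointly.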
Keywords
- Lagrange Multiplier
- Differential Item Functioning
- Item Response Theory
- Item Parameter
- Item Response Theory Model
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
References
Aitchison, J., & Silvey, S.D. (1958). Maximum likelihood estimation of parameters subject to restraints. Annals of Mathematical Statistics, 29, 813–828.
Bock, R.D., & Aitkin, M. (1981). Marginal maximum likelihood estimation of item parameters: An application of an EM-algorithm. Psychometrika, 46, 443–459.
Bock, R.D., & Zimowski, M.F. (1997). Multiple group IRT. In W.J. van der Linden & R.K. Hambleton (Eds.), Handbook of modern item response theory (pp. 433–448). New York: Springer-Verlag.
Camilli, G., & Shepard, L.A. (1994). Methods for identifying biased test items. Thousand Oaks, CA: Sage.
Cox, D.R., & Hinkley, D.V. (1974). Theoretical statistics. London: Chapman and Hall.
Cressie, N., & Holland, P.W. (1983). Characterizing the manifest probabilities of latent trait models. Psychometrika, 48, 129–141.
Efron, B. (1977). Discussion on maximum likelihood from incomplete data via the EM algorithm (by A. Dempster, N. Laird, and D. Rubin). Journal of the Royal Statistical Society, Series B, 39, 1–38.
Fischer, G.H. (1993). Notes on the Mantel-Haenszel procedure and another chi-square test for the assessment of DIF. Methodika, 7, 88–100.
Fischer, G.H. (1995). Some neglected problems in IRT. Psychometrika, 60, 459–487.
Glas, C.A.W. (1998). Detection of differential item functioning using Lagrange multiplier tests. Statistica Sinica, 8, 647–667.
Glas, C.A.W. (1999). Modification indices for the 2-PL and the nominal response model. Psychometrika, 64, 273–294.
Glas, C.A.W., & Verhelst, N.D. (1995). Tests of fit for polytomous Rasch models. In G.H. Fischer & I.W. Molenaar (Eds.), Rasch models: Foundations, recent developments, and applications (pp. 325–352). New York: Springer-Verlag.
Hambleton, R.K., & Rogers, H.J. (1989). Detecting potentially biased test items: Comparison of IRT area and Mantel-Haenszel methods. Applied Measurement in Education, 2, 313–334.
Holland, P.W., & Thayer, D.T. (1988). Differential item functioning and the Mantel-Haenszel procedure. In H. Wainer & H.I. Braun (Eds.), Test validity (pp. 129–145). Hillsdale, NJ: Erlbaum.
Holland, P.W., & Wainer, H. (1993). Differential item functioning. Hillsdale, NJ: Erlbaum.
Jansen, M.G.H., & Glas, C.A.W. (2001). Statistical tests for differential test functioning in Rasch’s model for speed tests. In A. Boomsma, M.A.J. van Duijn, & T.A.B. Snijders (Eds.), Essays on item response theory (pp. 149–162). New York: Springer-Verlag.
Kelderman, H. (1989). Item bias detection using loglinear IRT. Psychometrika, 54, 681–697.
Kok, F.G., Mellenbergh, G.J., & van der Flier, H. (1985). Detecting experimentally induced item bias using the iterative logit method. Journal of Educational Measurement, 22, 295–303.
Lord, F.M. (1980). Applications of item response theory to practical testing problems. Hillsdale, NJ: Erlbaum.
Louis, T.A. (1982). Finding the observed information matrix when using the EM algorithm. Journal of the Royal Statistical Society, Series B, 44, 226–233.
Mantel, N., & Haenszel, W. (1959). Statistical aspects of the analysis of data from retrospective studies of disease. Journal of the National Cancer Institute, 22, 719–748.
McCullagh, P., & Nelder, J. (1989). Generalized linear models (2nd ed.). London: Chapman and Hall.
Meijer, R.R., & Van Krimpen-Stoop, E.M.L.A. (2001). Person fit across subgroups: An achievement testing example. In A. Boomsma, M.A.J. van Duijn, & T.A.B. Snijders (Eds.), Essays on item response theory (pp. 377–390). New York: Springer-Verlag.
Meredith, W., & Millsap, R.E. (1992). On the misuse of manifest variables in the detection of measurement bias. Psychometrika, 57, 289–311.
Mislevy, R.J. (1984). Estimating latent distributions. Psychometrika, 49, 359–381.
Mislevy, R.J. (1986). Bayes modal estimation in item response models. Psychometrika, 51, 177–195.
Muraki, E., & Bock, R.D. (1991). PARSCALE: Parameter scaling of rating data [Computer software]. Chicago: Scientific Software.
Rao, C.R. (1947). Large sample tests of statistical hypotheses concerning several parameters with applications to problems of estimation. Proceedings of the Cambridge Philosophical Society, 44, 50–57.
Rao, C.R. (1973). Linear statistical inference and its applications. New York: Wiley.
Rigdon, S.E., & Tsutakawa, R.K. (1983). Parameter estimation in latent trait models. Psychometrika, 48, 567–574.
Rogers, H.J., Swaminathan, H., & Egan, K. (1999, April). A multi-level approach for investigating differential item functioning. Paper presented at the Annual Meeting of the NCME, Montreal.
Swaminathan, H., & Rogers, H.J. (1990). Detecting differential item functioning using logistic regression procedures. Journal of Educational Measurement, 27, 361–370.
Thissen, D., Steinberg, L., & Wainer, H. (1993). Detection of differential item functioning using the parameters of IRT models. In P.W. Holland & H. Wainer (Eds.), Differential item functioning (pp. 67–113). Hillsdale, NJ: Erlbaum.
Wang, W.-C. (in press). Modeling effects of differential item functioning in polytomous items. Journal of Outcome Measurement.
Zieky, M. (1993). Practical questions in the use of DIF statistics in test development. In P.W. Holland & H. Wainer (Eds.), Differential item functioning (pp. 337–347). Hillsdale, NJ: Erlbaum.
Zimowski, M.F., Muraki, E., Mislevy, R.J., & Bock, R.D. (1996). BILOG-MG: Multiple-group IRT analysis and test maintenance for binary items [Computer software]. Chicago: Scientific Software.
Zwinderman, A.H. (1991). A generalized Rasch model for manifest predictors. Psychometrika, 56, 589–600.
Zwinderman, A.H. (1997). Response models with manifest predictors. In W.J. van der Linden & R.K. Hambleton (Eds.), Handbook of modern item response theory (pp. 433–448). New York: Springer-Verlag.
Copyright information
© 2001 Springer Science+Business Media New York
About this chapter
Cite this chapter
Glas, C.A.W. (2001). Differential Item Functioning Depending on General Covariates. In: Boomsma, A., van Duijn, M.A.J., Snijders, T.A.B. (eds) Essays on Item Response Theory. Lecture Notes in Statistics, vol 157. Springer, New York, NY. https://doi.org/10.1007/978-1-4613-0169-1_7
DOI: https://doi.org/10.1007/978-1-4613-0169-1_7
Publisher Name: Springer, New York, NY
Print ISBN: 978-0-387-95147-8
Online ISBN: 978-1-4613-0169-1
eBook Packages: Springer Book Archive