Sample Size Determination Within the Scope of Conditional Maximum Likelihood Estimation with Special Focus on Testing the Rasch Model

Psychometrika

Abstract

This paper refers to the exponential family of probability distributions and the conditional maximum likelihood (CML) theory. It is concerned with the determination of the sample size for three groups of tests of linear hypotheses, known as the fundamental trinity of Wald, score, and likelihood ratio tests. The main practical purpose refers to the special case of tests of the class of Rasch models. The theoretical background is discussed and the formal framework for sample size calculations is provided, given a predetermined deviation from the model to be tested and the probabilities of the errors of the first and second kinds.

References

  • Agresti, A. (2002). Categorical data analysis (2nd ed.). New York: Wiley.

  • Aitchison, J., & Silvey, S. D. (1958). Maximum likelihood estimation of parameters subject to restraints. The Annals of Mathematical Statistics, 29, 813–828.

  • Andersen, E. B. (1970). Asymptotic properties of conditional maximum likelihood estimators. Journal of the Royal Statistical Society, Series B, 32, 283–301.

  • Andersen, E. B. (1973). A goodness of fit test for the Rasch model. Psychometrika, 38, 123–140.

  • Andersen, E. B. (1977). Sufficient statistics and latent trait models. Psychometrika, 42, 69–81.

  • Andersen, E. B. (1980). Discrete statistical models with social science applications. Amsterdam: North-Holland.

  • Andrich, D. (1978). A rating formulation for ordered response categories. Psychometrika, 43, 561–573.

  • Bahadur, R. R. (1954). Sufficiency and statistical decision functions. The Annals of Mathematical Statistics, 25, 423–462.

  • Barndorff-Nielsen, O. (1978). Information and exponential families in statistical theory. New York: Wiley.

  • Cohen, J. (1988). Statistical power analysis for the behavioral sciences. New York: Erlbaum.

  • Davidson, R. R., & Lever, E. L. (1967). The limiting distribution of the likelihood ratio statistic under a class of local alternatives. Florida State University Statistics Report M126, Tallahassee.

  • Diamond, E. L. (1963). The limiting power of categorical data chi-square tests analogous to normal analysis of variance. Annals of Mathematical Statistics, 34, 1432–1441.

  • Draxler, C. (2010). Sample size determination for Rasch model tests. Psychometrika, 75, 708–724.

  • Dynkin, E. B. (1951). Necessary and sufficient statistics for a family of probability distributions. Uspekhi Matematicheskikh Nauk, 6, 68–90.

  • Feder, P. I. (1968). On the distribution of the log likelihood ratio test statistic when the true parameter is near the boundaries of the hypothesis regions. The Annals of Mathematical Statistics, 39, 2044–2055.

  • Fischer, G. H. (1981). On the existence and uniqueness of maximum-likelihood estimates in the Rasch model. Psychometrika, 46, 59–77.

  • Fischer, G. H., & Molenaar, I. W. (1995). Rasch models: Foundations, recent developments, and applications. New York: Springer.

  • Fleiss, J. L. (1981). Statistical methods for rates and proportions (2nd ed.). New York: Wiley.

  • Gaffke, N., Steyer, R., & von Davier, A. A. (1999). On the asymptotic null-distribution of the Wald statistic at singular parameter points. Statistics & Decisions, 17, 339–358.

  • Glas, C. A. W. (1988). The derivation of some tests for the Rasch model from the multinomial distribution. Psychometrika, 53, 525–546.

  • Glas, C. A. W., & Verhelst, N. D. (1989). Extensions of the partial credit model. Psychometrika, 54, 635–659.

  • Glas, C. A. W., & Verhelst, N. D. (1995a). Testing the Rasch model. In G. H. Fischer & I. W. Molenaar (Eds.), Rasch models: Foundations, recent developments, and applications (pp. 69–95). New York: Springer.

  • Glas, C. A. W., & Verhelst, N. D. (1995b). Tests of fit for polytomous Rasch models. In G. H. Fischer & I. W. Molenaar (Eds.), Rasch models: Foundations, recent developments, and applications (pp. 325–352). New York: Springer.

  • Glas, C. A. W. (2006). Testing generalized Rasch models. In M. von Davier & C. H. Carstensen (Eds.), Multivariate and mixture distribution Rasch models: Extensions and applications (pp. 37–56). New York: Springer.

  • Haberman, S. J. (1974). The analysis of frequency data. Chicago: University of Chicago Press.

  • Haberman, S. J. (1981). Tests for independence in two-way contingency tables based on canonical correlation and on linear-by-linear interaction. The Annals of Statistics, 9, 1178–1186.

  • Kelderman, H. (1984). Log linear Rasch model tests. Psychometrika, 49, 223–245.

  • Kelderman, H. (1989). Item bias detection using log linear IRT. Psychometrika, 54, 681–697.

  • Martin-Löf, P. (1973). Statistiska Modeller. (Statistical models. Notes from seminars 1969–1970 by Rolf Sundberg.) Stockholm.

  • Masters, G. N. (1982). A Rasch model for partial credit scoring. Psychometrika, 47, 149–174.

  • Maydeu-Olivares, A., & Montano, R. (2013). How should we assess the fit of Rasch-type models? Approximating the power of goodness-of-fit statistics in categorical data analysis. Psychometrika, 78, 116–133.

  • Mitra, S. K. (1958). On the limiting power function of the frequency chi-square test. The Annals of Mathematical Statistics, 29, 1221–1233.

  • Müller, H. (1987). A Rasch model for continuous ratings. Psychometrika, 52, 165–181.

  • Neyman, J., & Pearson, E. S. (1928). On the use and interpretation of certain test criteria for purposes of statistical inference. Biometrika, 20A, 263–294.

  • Neyman, J., & Scott, E. L. (1948). Consistent estimates based on partially consistent observations. Econometrica, 16, 1–32.

  • Pearson, K. (1900). On the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling. Philosophical Magazine, Series 5, 50, 157–175.

  • Pfanzagl, J. (1993). On the consistency of conditional maximum likelihood estimators. Annals of the Institute of Statistical Mathematics, 45, 703–719.

  • Rao, C. R. (1948). Large sample tests of statistical hypotheses concerning several parameters with applications to problems of estimation. Proceedings of the Cambridge Philosophical Society, 44, 50–57.

  • Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests. Copenhagen: The Danish Institute of Education Research (Expanded Edition, 1980. Chicago: University of Chicago Press).

  • Rasch, G. (1961). On general laws and the meaning of measurement in psychology. Berkeley: University of California Press.

  • Satorra, A., & Saris, W. E. (1985). The power of the likelihood ratio test in covariance structure analysis. Psychometrika, 50, 83–90.

  • Silvey, S. D. (1959). The Lagrangian multiplier test. The Annals of Mathematical Statistics, 30, 389–407.

  • Stroud, T. W. F. (1972). Fixed alternatives and Wald’s formulation of the noncentral asymptotic behavior of the likelihood ratio statistic. The Annals of Mathematical Statistics, 43, 447–454.

  • van den Wollenberg, A. (1982). Two new test statistics for the Rasch model. Psychometrika, 47, 123–140.

  • Verhelst, N. D., & Glas, C. A. W. (1995). The one parameter logistic model. In G. H. Fischer & I. W. Molenaar (Eds.), Rasch models: Foundations, recent developments, and applications (pp. 215–237). New York: Springer.

  • Verhelst, N. D., Glas, C. A. W., & Verstralen, H. H. F. M. (1994). OPLM: Computer program and manual. Arnhem: CITO.

  • von Davier, A. A. (2003). Large sample tests for comparing regression coefficients in models with normally distributed variables. Research Report RR-03-29. Princeton, NJ: Educational Testing Service.

  • Wald, A. (1943). Test of statistical hypotheses concerning several parameters when the number of observations is large. Transactions of the American Mathematical Society, 54, 426–482.

  • Wilks, S. S. (1938). The large sample distribution of the likelihood ratio for testing composite hypotheses. The Annals of Mathematical Statistics, 9, 60–62.

  • Wilson, M., & Masters, G. N. (1993). The partial credit model and null categories. Psychometrika, 58, 87–99.


Author information

Correspondence to Clemens Draxler.

Appendix

This appendix discusses the asymptotic properties of the CML estimator under a fixed alternative and under a sequence of local alternative hypotheses. The key result is that a joint limiting distribution for the Wald, score, and likelihood ratio test statistics does not exist under a fixed alternative, whereas it does exist under a sequence of local alternative hypotheses.

Let \(\varvec{\beta }_{0} \) denote a vector satisfying (4) and assume that \(\varvec{\beta }=\varvec{\beta }_{0} \) holds. Then, according to Andersen’s (1970) results,

$$\begin{aligned} \sqrt{n} \left( {\hat{\varvec{{\beta }}}-\varvec{\beta }_{0}} \right) \xrightarrow {d}N\left[ {\varvec{0},n\varvec{{I}}\left( {\varvec{\beta }_{0}} \right) ^{-1}} \right] , \end{aligned}$$

for \(n\rightarrow \infty \), the Fisher information matrix \(\varvec{{I}}\left( {\varvec{\beta }_{0}} \right) \) being evaluated using \(\varvec{\beta }_{0} \). The covariance \(n\varvec{{I}}\left( {\varvec{\beta }_{0}} \right) ^{-1}\) of \(\sqrt{n} \left( {{\hat{\varvec{{\beta }}}-\varvec{\beta } }_{0}} \right) \) remains bounded as \(n\rightarrow \infty \) since the Fisher information is additive and may be rewritten as

$$\begin{aligned} \varvec{{I}}\left( {\varvec{\beta }_{0}} \right) = - E\left[ {\frac{{\partial ^{2} L\left( {\varvec{\beta }_{0}} \right) }}{{\partial \varvec{\beta }_{0} \partial \varvec{\beta } ^{\prime }_{0}}}} \right] = - \sum \limits _{{v = 1}}^{n} {E\left\{ {\frac{{\partial ^{2} \log \left[ {h_{{\varvec{\beta }_{0}}} \left( {\varvec{y}_{v} \left| {\varvec{T}_{v} = \varvec{t}_{v}} \right. } \right) } \right] }}{{\partial \varvec{\beta }_{0} \partial \varvec{\beta } ^{{\prime }}_{0}}}} \right\} }, \end{aligned}$$

i.e., the information contained in the conditional distribution of \(\varvec{Y}_{1} \left| {\varvec{T}_{1} =\varvec{t}_{1}} \right. ,\ldots ,\varvec{Y}_{n} \left| {\varvec{T}_{n} =\varvec{t}_{n}} \right. \) is obtained by summation of the information contained in the conditional distribution of each \(\varvec{Y}\left| {\varvec{T}=\varvec{t}} \right. \).

Let \(\varvec{\beta }_{a} =\varvec{\beta }_{0} +\varvec{\delta }, \varvec{\delta }\ne \varvec{0}\), be a vector not satisfying (4), i.e., a fixed alternative whose deviation \(\varvec{\delta }\) does not depend on n. If \(\varvec{\beta }=\varvec{\beta }_{a} \) holds, a limiting distribution of \(\sqrt{n} \left( {\hat{\varvec{{\beta }}}-\varvec{\beta }_{0}} \right) \) will not exist since, according to Andersen’s (1970) and Pfanzagl’s (1993) consistency theorems, \(\hat{\varvec{{\beta }}}\) will converge to \(\varvec{\beta }_{a} \) so that

$$\begin{aligned} \sqrt{n} \left( {\hat{\varvec{{\beta }}}-\varvec{\beta }_{0}} \right) =\sqrt{n} \left[ {\hat{\varvec{{\beta }}}-\left( {\varvec{\beta }_{a} -\varvec{\delta }} \right) } \right] \end{aligned}$$

will grow unboundedly with \(n\rightarrow \infty \), i.e., the mean vector amounts to \(\sqrt{n} \varvec{\delta }\). Hence, a limiting distribution for the three test statistics Wald, score, and likelihood ratio will also not exist if the fixed alternative \(\varvec{\beta }=\varvec{\beta }_{a} \) holds.

Let the assumption of a sequence of local alternative hypotheses \(\varvec{\beta }_{an} =\varvec{\beta }_{0} +\varvec{\delta }n^{-0.5}\) be introduced so that \(\varvec{T\beta }_{an} \rightarrow \varvec{c}\) as \(n\rightarrow \infty \). Then, one obtains

$$\begin{aligned} \sqrt{n} \left( {\hat{\varvec{{\beta }}}-\varvec{\beta }_{0}} \right) =\sqrt{n} \left[ {\hat{\varvec{{\beta }}}-\left( {\varvec{\beta }_{an} -\varvec{\delta }n^{-0.5}} \right) } \right] \xrightarrow {d}N\left[ {\varvec{\delta },n\varvec{{I}}\left( {\varvec{\beta }_{0}} \right) ^{-1}} \right] \end{aligned}$$

as \(n\rightarrow \infty \), since both \(\hat{\varvec{{\beta }}}\rightarrow \varvec{\beta }_{0} \) and \(\varvec{\beta }_{an} \rightarrow \varvec{\beta }_{0} \). With this and given technical regularity conditions, the results regarding the joint limiting distribution of the Wald, score, and likelihood ratio test statistics derived in the papers cited in Section 3.2 apply. The non-central \(\chi ^{2}\) distribution follows from the multivariate normal distribution of \(\sqrt{n} \left( {\hat{\varvec{{\beta }}}-\varvec{\beta }_{0}} \right) \) with mean vector \(\varvec{\delta }\ne \varvec{0}\). It is also worth noting that the statistic divided by n has a positive limit, which may provide a measure of model error comparable to the well-known effect measures suggested by Cohen (1988).
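A short worked step may make the last two remarks concrete. It is a sketch only, and it assumes the simplest case in which the null hypothesis fixes all \(p\) components of \(\varvec{\beta }\) at \(\varvec{\beta }_{0} \); for the general linear hypothesis (4) the quadratic form has to be replaced by its restricted counterpart. The symbols \(\varvec{Z}_{n} =\sqrt{n} \left( {\hat{\varvec{{\beta }}}-\varvec{\beta }_{0}} \right) \), \(\varvec{\varSigma }=n\varvec{{I}}\left( {\varvec{\beta }_{0}} \right) ^{-1}\), \(\lambda \), and \(\varvec{d}\) are introduced here for illustration. The standard result on quadratic forms in multivariate normal vectors then gives

$$\begin{aligned} \varvec{Z}_{n} ^{\prime }\varvec{\varSigma }^{-1}\varvec{Z}_{n} \xrightarrow {d}\chi ^{2}_{p} \left( \lambda \right) ,\quad \lambda =\varvec{\delta }^{\prime }\varvec{\varSigma }^{-1}\varvec{\delta }. \end{aligned}$$

If the local alternative is matched to a fixed deviation \(\varvec{d}\) at the sample size of interest, i.e., \(\varvec{\delta }=\sqrt{n} \varvec{d}\), one obtains

$$\begin{aligned} \lambda =n\varvec{d}^{\prime }\varvec{\varSigma }^{-1}\varvec{d},\quad \frac{\lambda }{n} =\varvec{d}^{\prime }\varvec{\varSigma }^{-1}\varvec{d}, \end{aligned}$$

so the non-centrality parameter grows linearly in n, while \(\lambda /n\) does not depend on n.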

In Section 4.1 it is assumed that the sequence of nuisance parameters \(\varvec{\theta }_{1} ,\ldots ,\varvec{\theta }_{n} \) is independently and identically distributed with joint probability density function \(\varphi \left( {\varvec{\theta }} \right) \) so that the common probability distribution of the sequence \(\varvec{T}_{1} ,\ldots ,\varvec{T}_{n} \) is given by (9). Under this assumption it holds for the asymptotic covariance matrix of \(\sqrt{n} \left( {\hat{\varvec{{\beta }}}-\varvec{\beta }_{0}} \right) \) that, according to (10),

$$\begin{aligned} n\varvec{{I}}\left( {\varvec{\beta }_{0}} \right) ^{-1}=nn^{-1}{\varvec{\varGamma }}\left( {\varvec{\beta }_{0}} \right) ^{-1}={\varvec{\varGamma }}\left( {\varvec{\beta }_{0}} \right) ^{-1}, \end{aligned}$$

with \({\varvec{\varGamma }}\left( {\varvec{\beta }_{0}} \right) \) given by the second factor (the integral) of (10) times \(-1\) evaluated at \(\varvec{\beta }_{0} \). Hence, for \(n\rightarrow \infty \), if \(\varvec{\beta }=\varvec{\beta }_{0} \) holds,

$$\begin{aligned} \sqrt{n} \left( {\hat{\varvec{{\beta }}}-\varvec{\beta }_{0}} \right) \xrightarrow {d}N\left[ {\varvec{0},{\varvec{\varGamma }}\left( {\varvec{\beta }_{0}} \right) ^{-1}} \right] \end{aligned}$$

and under the sequence of local alternative hypotheses

$$\begin{aligned} \sqrt{n} \left( {\hat{\varvec{{\beta }}}-\varvec{\beta }_{0}} \right) \xrightarrow {d}N\left[ {\varvec{\delta },{\varvec{\varGamma }}\left( {\varvec{\beta }_{0}} \right) ^{-1}} \right] . \end{aligned}$$
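The two limiting distributions above are what make a sample size calculation possible. The following Python sketch illustrates the idea under strong simplifying assumptions and is not the authors' procedure: it covers a single restriction (one degree of freedom), with a scalar fixed deviation `d` and a scalar per-observation information `gamma` standing in for \({\varvec{\varGamma }}\left( {\varvec{\beta }_{0}} \right) \). The function names and inputs are hypothetical. With \(\delta =\sqrt{n} d\), the non-centrality parameter is \(n d^{2} \gamma \), and a non-central \(\chi ^{2}\) variable with one degree of freedom can be handled with the standard normal distribution alone.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def power_df1(n, d, gamma, alpha=0.05):
    """Approximate power of a 1-df chi-square test at sample size n.

    d     : fixed scalar deviation from the null value (hypothetical input)
    gamma : scalar per-observation information, standing in for Gamma(beta_0)
    """
    lam = n * d * d * gamma                     # non-centrality for delta = sqrt(n) * d
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)  # square root of the chi-square(1) critical value
    root = sqrt(lam)
    # A non-central chi-square(1, lam) variable is (Z + sqrt(lam))**2 with Z ~ N(0, 1).
    return STD_NORMAL.cdf(root - z_crit) + STD_NORMAL.cdf(-root - z_crit)


def min_sample_size(d, gamma, alpha=0.05, beta=0.20):
    """Smallest n whose approximate power reaches 1 - beta."""
    if d == 0 or gamma <= 0:
        raise ValueError("need a nonzero deviation and positive information")
    target = 1 - beta
    hi = 1
    while power_df1(hi, d, gamma, alpha) < target:  # double n until the target is bracketed
        hi *= 2
    lo = hi // 2
    while hi - lo > 1:                              # bisection: power increases with n
        mid = (lo + hi) // 2
        if power_df1(mid, d, gamma, alpha) >= target:
            hi = mid
        else:
            lo = mid
    return hi


if __name__ == "__main__":
    n = min_sample_size(d=0.3, gamma=0.25)
    print(n, round(power_df1(n, 0.3, 0.25), 4))
```

For more than one restriction the same search applies, but the power has to be taken from a general non-central \(\chi ^{2}\) distribution (for example `scipy.stats.ncx2`), and the non-centrality becomes the quadratic form in the deviation vector rather than a scalar product.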

Cite this article

Draxler, C., Alexandrowicz, R.W. Sample Size Determination Within the Scope of Conditional Maximum Likelihood Estimation with Special Focus on Testing the Rasch Model. Psychometrika 80, 897–919 (2015). https://doi.org/10.1007/s11336-015-9472-y

  • DOI: https://doi.org/10.1007/s11336-015-9472-y
