Applying Item Response Theory Models to Entrance Examination for Graduate Studies: Practical Issues and Insights



Item response theory (IRT) is a psychometric framework for the design, analysis, and scaling of standardized assessments, psychological instruments, and other measurement tools. Despite its increasing use in educational and psychological assessments in many countries, it has not been applied to any large-scale assessment in Turkey. The purpose of this study is to investigate the fit of unidimensional IRT models to the Entrance Examination for Graduate Studies, a high-stakes, large-scale assessment required for applying to graduate programs in Turkish universities. General assumptions of item response modeling, such as unidimensionality, local independence, and measurement invariance, are examined, along with model-specific assumptions such as equal item discrimination and minimal guessing. The findings suggest that the three-parameter IRT model shows the best model-data fit for the Entrance Examination for Graduate Studies. The results also highlight potential issues that need to be addressed, such as high omit rates, speededness of the test, and aberrant guessing behaviors.
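
To make the model-specific assumptions concrete, the following is a minimal sketch of the standard three-parameter logistic (3PL) item response function. It is not the estimation procedure used in the study. The function name `p_correct` and the parameter values are illustrative assumptions. Fixing the guessing parameter `c` at 0 gives the two-parameter model (minimal guessing), and additionally holding the discrimination `a` equal across items gives the one-parameter model (equal item discrimination).

```python
import math


def p_correct(theta, a=1.0, b=0.0, c=0.0):
    """Probability of a correct response under the 3PL model.

    theta: examinee ability
    a: item discrimination (held equal across items in the one-parameter model)
    b: item difficulty
    c: pseudo-guessing lower asymptote (fixed at 0 in the one- and two-parameter models)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical item parameters, chosen only for illustration.
theta = 0.0
p_1pl = p_correct(theta, b=0.0)                # common a, no guessing
p_2pl = p_correct(theta, a=1.7, b=0.0)         # item-specific a, no guessing
p_3pl = p_correct(theta, a=1.7, b=0.0, c=0.2)  # item-specific a, with guessing

print(p_1pl, p_2pl, p_3pl)
```

With a nonzero `c`, the probability of a correct response never falls below `c`, even for very low ability. This is why guessing on multiple-choice items tends to favor the three-parameter model over the simpler ones.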





Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.