Comparison of Likelihood Ratio Test (LRT), Poly-SIBTEST and Logistic Regression in Differential Item Functioning (DIF) Detection Procedures

Res. Asst. Çiğdem Akın Arıkan

Abstract

This study examines whether the attitude items of the Mathematics Work Ethic questionnaire in the PISA 2012 student questionnaire show differential item functioning (DIF) by gender. The DIF analyses were carried out with the Item Response Theory-Likelihood Ratio (IRT-LR), ordinal logistic regression (OLR) and Poly-SIBTEST methods. The study sample consists of 1000 individuals (500 females, 500 males) selected at random with SPSS from the 3217 individuals who responded to the mathematics work ethic attitude items. Item analyses of the data were conducted with IRT. In the gender-based DIF analyses, item 3 showed high-level DIF in favour of males according to Poly-SIBTEST, moderate DIF according to IRT-LR, and high-level non-uniform DIF according to OLR.
Keywords: PISA, Differential Item Functioning, IRT-LR, Poly-SIBTEST, OLR

Extended Abstract

Problem and Purpose: PISA, which is administered in many countries, identifies the strengths and weaknesses of participating countries' education systems and paves the way for future-oriented education policies. For this reason, PISA results need to illuminate differences among individuals and contain minimal error; in other words, evidence for the reliability and validity of the measurement results must be presented. One important threat to validity is item and test bias (Clauser and Mazor, 1998). Individuals at the same ability level are normally expected to obtain the same score on a test or its items. However, the probability of a correct answer may differ between subgroups at the same ability level because of the conditions of the test or some attribute of the item; this is called differential item functioning (DIF) (Zumbo, 2007; Finch and French, 2007). DIF refers to a systematic difference in the probability of answering a particular test item among individuals with the same test score and ability level (Doğan, Guerrero and Tatsuoka, 2005). For attitude items, Hulin, Drasgow and Parsons (1983) define DIF as a differing probability of expressing a positive attitude towards an item among individuals in different subgroups at the same trait level (cited in Johanson and Dodeen, 2003); this definition is formalized in the sketch below. In attitude scales, the presence of DIF undermines the accuracy of the measurement results. This study examines whether the items in the PISA 2012 mathematics work ethic questionnaire function differently by gender, using the Poly-SIBTEST, IRT-LR and OLR methods, and investigates whether the items flagged for DIF change depending on the method.
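As a hedged illustration (formalizing the definition above, not an equation taken from the paper), DIF for a polytomous item i can be written as a conditional-probability inequality:

\[
P(X_i = x \mid \theta, G = R) \;\neq\; P(X_i = x \mid \theta, G = F)
\quad \text{for some category } x \text{ and trait level } \theta,
\]

where \theta is the matched trait level and G indicates the reference (R) or focal (F) group. Uniform DIF means the inequality points in the same direction at every \theta; non-uniform DIF means the direction depends on \theta, i.e., a trait-by-group interaction.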

Method: The responses to the attitude items in the mathematics work ethic questionnaire of the PISA 2012 student questionnaire were downloaded from the OECD web site (http://pisa2012.acer.edu.au/) in January 2014. In the Turkey sample of the PISA 2012 administration, 3217 15-year-old students responded to the mathematics work ethic attitude items. After cases with missing data were removed, 1000 individuals (500 females and 500 males) were randomly selected with SPSS from the remaining 3130. The nine items in the questionnaire were checked for DIF by gender. The SIBTEST software was used for the Poly-SIBTEST method, the macro developed by Zumbo (1999) for the OLR method, and the IRTLRDIF software for IRT-LR; a sketch of the OLR step is given below. In the analyses, males served as the focal group and females as the reference group.
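As a minimal sketch (an illustration under stated assumptions, not the authors' macro or the exact IRTLRDIF/SIBTEST implementations), Zumbo's (1999) OLR approach fits three nested ordinal logistic models per item and tests each added term with a likelihood-ratio chi-square; the function name, column names and the statsmodels-based implementation below are assumptions:

    # Illustrative OLR DIF screening for one polytomous item, in the spirit
    # of Zumbo (1999); variable names and the step-wise 1-df tests are
    # assumptions, not the paper's exact procedure.
    import pandas as pd
    from scipy.stats import chi2
    from statsmodels.miscmodels.ordinal_model import OrderedModel

    def olr_dif(item, total, group):
        """LR tests for uniform and non-uniform DIF on one ordinal item.

        item  : ordinal responses to the studied item
        total : matching variable (e.g., questionnaire total score)
        group : 0 = reference (females), 1 = focal (males)
        """
        X = pd.DataFrame({"total": total, "group": group})
        X["inter"] = X["total"] * X["group"]      # score-by-group interaction
        y = pd.Series(item)

        def fit(cols):
            # OrderedModel takes no intercept; thresholds act as cutpoints
            return OrderedModel(y, X[cols], distr="logit").fit(
                method="bfgs", disp=False)

        m1 = fit(["total"])                        # matching score only
        m2 = fit(["total", "group"])               # + group: uniform DIF
        m3 = fit(["total", "group", "inter"])      # + interaction: non-uniform

        g2_uni = 2 * (m2.llf - m1.llf)             # LR chi-square, 1 df each
        g2_non = 2 * (m3.llf - m2.llf)
        return {"p_uniform": chi2.sf(g2_uni, df=1),
                "p_nonuniform": chi2.sf(g2_non, df=1)}

In practice the chi-square flags are usually paired with an effect size (e.g., the change in pseudo-R-squared between the nested models) before DIF is labelled negligible, moderate or large, which is one reason different methods can agree on the flagged item yet disagree on the DIF level.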

Conclusion and Recommendations: According to the DIF analysis results, item 3 shows high-level DIF according to Poly-SIBTEST, moderate DIF according to IRT-LR, and high-level non-uniform DIF according to OLR. While the IRT-LR and Poly-SIBTEST methods detect uniform DIF, OLR can detect both uniform and non-uniform DIF, and therefore provides more detail about the nature of DIF (the likelihood-ratio statistic underlying the IRT-LR comparison is sketched after this paragraph). The results show that although the methods flag the same item for DIF, the estimated amount of DIF differs; this difference is thought to stem from the different value intervals each method uses to classify DIF levels. Asil (2010) points out that DIF in PISA items often arises from translation and adaptation errors. Regarding item bias, Van de Vijver and Tanzer (2004), and Hambleton, Merenda and Spielberger (2005) argue that DIF arises, especially in questionnaires used for international comparisons, from causes such as inadequate translation, typographical errors in items and ambiguous translation of items.
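As a hedged aside (standard IRT-LR theory, not a result reported in this paper), the statistic computed by IRTLRDIF compares a compact model, in which the studied item's parameters are constrained equal across groups, with an augmented model in which they are freed:

\[
G^2 = -2\left[\ln L_{\text{compact}} - \ln L_{\text{augmented}}\right] \sim \chi^2_{df},
\]

where df is the number of item parameters freed across groups; a significant G^2 flags the studied item for DIF.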
Keywords: PISA, Likelihood Ratio Test (LRT), Poly-SIBTEST, Logistic Regression, Differential Item Functioning (DIF)



DOI: http://dx.doi.org/10.19160/e-ijer.24504

References


Asil, M. (2010). Uluslararası Öğrenci Değerlendirme Programı (PISA) 2006 öğrenci anketinin kültürler arası eşdeğerliğinin incelenmesi. Yayınlanmamış doktora tezi, Hacettepe Üniversitesi Sosyal Bilimler Enstitüsü, Ankara.

Atar, B., & Kamata, A. (2011). MTK olabilirlik oranı testi ve lojistik regresyon değişen madde fonksiyonu belirleme yöntemlerinin karşılaştırılması. Hacettepe Üniversitesi Eğitim Fakültesi Dergisi, 41, 36-47.

Ayan, C. (2011). PISA 2009 fen okuryazarlığı alt testinin değişen madde fonksiyonu açısından incelenmesi. Yayınlanmamış yüksek lisans tezi, Hacettepe Üniversitesi Sosyal Bilimler Enstitüsü, Ankara.

Bakan Kalaycıoğlu, D., & Kelecioğlu, H. (2011). Öğrenci Seçme Sınavı'nın madde yanlılığı açısından incelenmesi. Eğitim ve Bilim, 36(161), 3-12.

Camilli, G. (2006). Test fairness. In R. L. Brennan (Ed.), Educational measurement (4th ed., pp. 221-256). Westport: American Council on Education & Praeger Publishers.

Clauser, B. E., & Mazor, K. M. (1998). Using statistical procedures to identify differentially functioning test items. Educational Measurement: Issues and Practice, 17(1), 31-44.

Costa, P. D., & Araújo, L. (2012). Differential item functioning (DIF): What functions differently for immigrant students in PISA 2009 reading items? European Commission Joint Research Centre, Italy.

Doğan, E., Guerrero, A., & Tatsuoka, K. (2005). Using DIF to investigate strengths and weaknesses in mathematics achievement profiles of 10 different countries. Paper presented at the annual meeting of the National Council on Measurement in Education (NCME), 12-14 April 2005, Montreal, Canada.

Embretson, S. E., & Reise, S. P. (2000). Item response theory for psychologists. Mahwah, NJ: Lawrence Erlbaum Associates, Inc.

Eğitim Reformu Girişimi (2010). PISA 2009 sonuçlarına ilişkin değerlendirme. (http://erg.sabanciuniv.edu/sites/erg.sabanciuniv.edu/files/PISA2009DegerlendirmeNotu_Final_08022010.pdf, accessed: 06.05.2014).

Finch, W. H., & French, B. F. (2007). Detection of crossing differential item functioning: A comparison of four methods. Educational and Psychological Measurement, 67(4).

Gök, B., Atalay Kabasakal, K., & Kelecioğlu, H. (2015). PISA 2009 öğrenci anketi tutum maddelerinin kültüre göre değişen madde fonksiyonu açısından incelenmesi. Eğitimde ve Psikolojide Ölçme ve Değerlendirme Dergisi, 5(1), 72-87.

Hambleton, R. K. & De Jong, J.H.A.L. (2003). Advances in translating and adapting educational and psychological tests. Language Testing, 20(2).

Hambleton, R. K., & Swaminathan, H. (1985). Item response theory: Principles and applications. Norwell, MA: Kluwer Academic Publishers.

Hambleton, R. K., Merenda, P. F., & Spielberger, C. D. (Eds.) (2005). Adapting educational and psychological tests for cross-cultural assessment. Mahwah, NJ: Lawrence Erlbaum.

Huang, X. (2010). Differential item functioning: The consequence of language, curriculum, or culture? Unpublished PhD dissertation, University of California, Berkeley, USA.

Johanson, G. A., & Dodeen, H. (2003). An analysis of sex-related differential item functioning in attitude assessment. Assessment & Evaluation in Higher Education, 28.

Le, L. T. (2009). Investigating gender differential item functioning across countries and test languages for PISA science items. International Journal of Testing, 9(2), 122-133.

MEB (2013). PISA 2012 Uluslararası Öğrenci Değerlendirme Programı Ulusal Ön Raporu. MEB, Ankara.

Öğretmen, T. (1995). Differential item functioning analysis of the verbal ability section of the first stage of the university entrance examination in Turkey. Yayımlanmamış yüksek lisans tezi, Orta Doğu Teknik Üniversitesi.

Özdemir, D. (2003). Çoktan seçmeli testlerde iki kategorili ve önsel ağırlıklı puanlamanın diferansiyel madde fonksiyonuna etkisi ile ilgili bir araştırma. Eğitim ve Bilim, 25, 37-44.

Stout, W. F., & Roussos, L. A. (1995). SIBTEST user's manual (2nd ed.). Unpublished manuscript, University of Illinois at Urbana-Champaign, Illinois.

Thissen, D. (2001). IRTLRDIF (Version 2.0b) [Computer software]. Chapel Hill: L. L. Thurstone Psychometric Laboratory, University of North Carolina.

Van de Vijver, F. J. R., & Tanzer, N. K. (2004). Bias and equivalence in cross-cultural assessment. European Review of Applied Psychology, 54, 119-135.

Zumbo, B. D. (1999). A handbook on the theory and methods of differential item functioning (DIF): Logistic regression modeling as a unitary framework for binary and Likert-type (ordinal) item scores. Ottawa, ON: Directorate of Human Resources Research and Evaluation, Department of National Defense.

Zumbo, B. D. (2007). Three generations of DIF analyses: Considering where it has been, where it is now, and where it is going. Language Assessment Quarterly, 4(2), 223-233.