ASSESSING INTER-RATER AGREEMENT ABOUT ITEM-WRITING FLAWS IN MULTIPLE-CHOICE QUESTIONS OF CLINICAL ANATOMY
Faculdade de Medicina da Universidade do Porto (PORTUGAL)
About this paper:
Appears in: EDULEARN13 Proceedings
Publication year: 2013
Pages: 5921-5924
ISBN: 978-84-616-3822-2
ISSN: 2340-1117
Conference name: 5th International Conference on Education and New Learning Technologies
Dates: 1-3 July, 2013
Location: Barcelona, Spain
Abstract:
Multiple-choice questions (MCQs) are regularly used in exams in order to assess students in health science disciplines. Despite this fact, MCQ items often have item-writing flaws, and few educators have formal instruction in writing MCQs [1].

The major purpose of our study was to estimate the inter-rater agreement about item classification as either standard or flawed.

To achieve this goal, four judges (2 teachers and 2 students), blinded to all item performance data, independently classified each of 920 test items from 10 examinations as either standard or flawed. If an item was flawed, the exact type of flaw or flaws present in the question stem and its options was recorded.

In this study, the standard item was operationally defined as any item that did not violate one or more of the 31 principles noted in a review article [2] which summarized current educational measurement recommendations concerning item writing.
Fleiss' kappa was used to evaluate the inter-rater agreement among the 4 judges prior to the consensus process.

The agreement about item classification as either standard or flawed was fair (kappa=0.3).
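
For readers unfamiliar with the statistic, the following is a minimal sketch of how Fleiss' kappa can be computed for a standard/flawed classification made by a fixed number of judges per item. It is not the authors' analysis code: the function name `fleiss_kappa`, the label strings, and the toy ratings are assumptions made purely for illustration, and the toy data are not the study's data.

```python
from collections import Counter


def fleiss_kappa(ratings, categories=("standard", "flawed")):
    """Fleiss' kappa for a list of items, each rated by the same number of judges.

    `ratings` is a list of per-item label lists, e.g. four labels per item.
    The function name and label strings are hypothetical, chosen for this sketch.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if any(len(item) != n_raters for item in ratings):
        raise ValueError("every item must be rated by the same number of judges")
    if any(label not in categories for item in ratings for label in item):
        raise ValueError("unknown category label in ratings")

    # Count how many judges assigned each category to each item.
    counts = [Counter(item) for item in ratings]

    # Share of all assignments that fell into each category.
    p_cat = [
        sum(c[cat] for c in counts) / (n_items * n_raters) for cat in categories
    ]

    # Observed agreement per item: proportion of agreeing judge pairs.
    p_item = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]

    p_observed = sum(p_item) / n_items
    p_expected = sum(p * p for p in p_cat)

    if p_expected == 1.0:
        # All assignments fell into one category; kappa is undefined.
        return float("nan")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Toy example: 4 items, 4 judges each (illustrative data only).
toy_ratings = [
    ["standard", "standard", "standard", "standard"],
    ["flawed", "flawed", "flawed", "flawed"],
    ["standard", "standard", "flawed", "flawed"],
    ["standard", "standard", "standard", "flawed"],
]
toy_kappa = fleiss_kappa(toy_ratings)
```

On the commonly used Landis and Koch scale, values between 0.21 and 0.40 are labelled "fair", which is the band the reported kappa of 0.3 falls into.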

Although the agreement was substantial for the more prevalent principles, the results generally showed many disagreements among judges about item classification prior to the consensus process. Future investigation should assess whether the presence of a flaw or flaws in an MCQ item affects its quality, namely whether it interferes with the item's difficulty and discrimination indices.

References:
[1] Tarrant, M. and Ware, J. (2008), Impact of item writing flaws in multiple choice questions on student achievement in high stakes nursing assessments, Medical Education, 42 (2), pp. 198-206.
[2] Haladyna, T. M., Downing, S. M., and Rodriguez, M. C. (2002), A review of multiple-choice item-writing guidelines for classroom assessment, Applied Measurement in Education, 15 (3), pp. 309-333.

Supported by IJUP Controlo de Qualidade em Educação Médica: prevalência, determinantes e recomendações na meta-avaliação de testes de escolha múltipla em Anatomia (Ref. PP_IJUP2011 67).
Keywords:
Multiple-choice questions, meta-evaluation, clinical anatomy, flaws.