COMPUTER-AIDED TESTING: ASSESSMENT OF AUTOMATIC ITEM GENERATION TO CREATE MULTIPLE CHOICE TEST ITEMS
Computer-aided testing is an alternative to traditional paper-and-pen tests. Although computer-based examination methods are well established in academic centres, their quality still needs continuous monitoring.
In the academic year 2013/14, doctoral students of the Medical Faculty at the Medical University of Warsaw (MUW) took part in the "Reliability in research" course, which was conducted as a blended-learning course for the first time. To receive credit for the course, the students had to pass a final test of the e-learning course, which was made available on the Moodle platform.
Aim of study:
To evaluate the usefulness of an automatically generated, computer-interactive multiple-choice test for assessing the achievements of blended-learning students.
Materials and Methods:
A total of 96 PhD students took part in the study: 45 first-year students (25 physicians) and 51 second-year students (24 physicians) of a doctoral degree course. The two groups were assumed to be comparable and to serve as control groups for each other. The e-learning test comprised a total of 43 multiple-choice questions (four options to choose from), grouped into the following categories:
(1) ethical aspects of scientific unreliability,
(2) scientific misconduct,
(3) copyright and research activity,
(4) conflict of interest in research,
(5) rules of Good Research Practice.
A test set was generated individually for each student from all the questions in the database.
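The individual generation of test sets can be sketched as random sampling within each thematic category. The Python snippet below is only an illustration, not the Moodle mechanism itself: the per-category pool sizes, the number of questions drawn per category, and the names `CATEGORY_SIZES`, `build_pool` and `generate_test` are all assumptions, since the source gives only the total of 43 questions and the five categories.

```python
import random

# Hypothetical split of the 43 questions across the five thematic categories.
# The source reports only the total, so these sizes are illustrative.
CATEGORY_SIZES = {1: 8, 2: 9, 3: 9, 4: 8, 5: 9}

# Hypothetical number of questions drawn from each category per test.
DRAWS_PER_CATEGORY = {1: 4, 2: 4, 3: 4, 4: 4, 5: 4}


def build_pool(category_sizes):
    """Assign consecutive question IDs to each category."""
    pool = {}
    next_id = 1
    for category, size in category_sizes.items():
        pool[category] = list(range(next_id, next_id + size))
        next_id += size
    return pool


def generate_test(pool, draws_per_category, rng):
    """Draw questions without replacement within each category, then shuffle."""
    test = []
    for category, question_ids in pool.items():
        test.extend(rng.sample(question_ids, draws_per_category[category]))
    rng.shuffle(test)
    return test


pool = build_pool(CATEGORY_SIZES)
rng = random.Random(2014)  # fixed seed so the sketch is reproducible
tests = [generate_test(pool, DRAWS_PER_CATEGORY, rng) for _ in range(96)]
```

Sampling within categories keeps the thematic composition of every test set fixed, while the particular questions differ from student to student.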
To assess the quality of the individual question sets, the easiness of each test version was compared, as was the frequency with which individual questions were used, both across the entire pool and within the thematic areas. The significance of differences in results between the two groups was assessed, and the mean time needed to complete the test was evaluated in each group. The non-parametric Mann-Whitney U test was used for the analysis. The significance level for all analyses was set at P < 0.05.
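As a minimal sketch of the two quantities used in the analysis, the Python code below computes easiness as the proportion of correct answers and compares two groups with a Mann-Whitney U test. It assumes the usual definition of easiness, and it uses the tie-corrected normal approximation without continuity correction for the two-sided P value, which may differ from what the statistical package used in the study reports. The function names and the sample values are hypothetical.

```python
import math


def easiness(responses):
    """Proportion of correct answers (1 = correct, 0 = incorrect)."""
    return sum(responses) / len(responses)


def mann_whitney_u(x, y):
    """Return (U, two-sided P) using the tie-corrected normal approximation."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    labelled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])

    rank_sum_x = 0.0
    tie_term = 0
    i = 0
    while i < n:
        # Find the run of tied values starting at position i.
        j = i
        while j < n and labelled[j][0] == labelled[i][0]:
            j += 1
        t = j - i
        avg_rank = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if labelled[k][1] == 0)
        tie_term += t**3 - t
        i = j

    u1 = rank_sum_x - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u, 1.0
    z = (u - mean_u) / math.sqrt(var_u)
    return u, math.erfc(abs(z) / math.sqrt(2))


# Hypothetical per-student easiness values for two small groups.
first_year = [0.85, 0.90, 0.80, 0.95, 0.75]
second_year = [0.80, 0.90, 0.85, 1.00, 0.70]
u_stat, p_value = mann_whitney_u(first_year, second_year)
significant = p_value < 0.05
```

The normal approximation is reasonable for group sizes like the 45 and 51 students described here; for very small samples an exact test would be preferable.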
Results:
The questions included in the automatically generated tests reflected the proportion of questions within the thematic subgroups. Deviation from representativeness within the fields did not exceed 1.5%, and the frequency of use of individual test questions ranged between 1.35% and 3.13% (mean: 2.33 ± 0.45%). Total test easiness was high, amounting to 0.854 (0.755 – 1.000), and the two groups of students did not differ significantly in this respect (P > 0.05). Nor were significant differences found in the time that first- and second-year students spent completing the test (460.9 ± 124.3 s versus 436.9 ± 136.0 s, P > 0.05).
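The reported mean frequency of use is consistent with a pool of 43 questions, because 1/43 ≈ 2.33%. The short Python check below illustrates this under one assumed reading of "frequency of use": each question's share of all question uses across all generated tests. The test length of 20 and the unstratified sampling are hypothetical simplifications, and the simulated spread will not reproduce the reported range.

```python
import random
from collections import Counter

POOL_SIZE = 43    # from the source
N_STUDENTS = 96   # from the source
TEST_LENGTH = 20  # hypothetical; not stated in the source

rng = random.Random(0)
tests = [rng.sample(range(1, POOL_SIZE + 1), TEST_LENGTH) for _ in range(N_STUDENTS)]

# Count how often each question appears across all generated tests.
uses = Counter(question for test in tests for question in test)
total_uses = sum(uses.values())

# Each question's share of all uses, in percent (0 for unused questions).
usage_pct = {q: 100 * uses[q] / total_uses for q in range(1, POOL_SIZE + 1)}

# The shares sum to 100%, so their mean is 100 / 43 whatever the test length.
mean_pct = sum(usage_pct.values()) / POOL_SIZE
```

Under this reading the mean carries no information about sampling quality; the spread around it (the reported 1.35% to 3.13%) is what shows how evenly the questions were used.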
Conclusions:
Reliable computer-based examination methods are used to meet the requirement for uniform rules and criteria in assessing students' achievements. Automatic generation of question sets with the Moodle platform tools may ensure a fair and unbiased assessment of educational progress. The quality of computer-aided testing is comparable to that of traditional paper-and-pen tests with regard to assessing the achievement of selected learning outcomes.