G. Montalvo, D. Marín

University of Alcalá (SPAIN)
The same database of multiple-choice questions was used in subjects from two different university degrees (chemistry and pharmacy) whose syllabi include the module "Thermodynamics". These questions were used to run several online assessments.

The targets established were: i) checking the correct formulation and understandability of each question (item); ii) verifying the quality of each test; iii) comparing the quality of the same question when applied to various groups of students from two different degree courses; and, finally, iv) taking corrective action, including reformulating the questions or redesigning the test as a whole, when required.

Subsequent statistical evaluation of the empirical evidence collected showed that all items met the desired psychometric requirements. This analysis of the quality of the test was performed in terms of discrimination (D), difficulty (p) and the relationship between them.
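As an illustration of the two indices named above, the following minimal sketch computes a difficulty value p and a discrimination value D for each item from a matrix of scored responses. The abstract does not state which formulas were used, so the definitions here are assumptions taken from classical test theory: p is the proportion of students who answered the item correctly, and D is the difference in that proportion between the highest-scoring and lowest-scoring groups of students. The function name `item_statistics` and the 27% group fraction are likewise illustrative choices, not the authors' procedure.

```python
from typing import List, Tuple


def item_statistics(
    responses: List[List[int]], group_fraction: float = 0.27
) -> List[Tuple[float, float]]:
    """Return one (p, D) pair per item.

    responses[s][i] is 1 if student s answered item i correctly, else 0.
    Assumed definitions (not taken from the abstract):
      p = proportion of all students answering the item correctly
      D = proportion correct in the upper group minus the lower group
    """
    n_students = len(responses)
    n_items = len(responses[0])

    # Rank students by total test score, best first.
    ranked = sorted(responses, key=sum, reverse=True)

    # Size of the upper and lower comparison groups (at least one student).
    k = max(1, round(n_students * group_fraction))
    upper, lower = ranked[:k], ranked[-k:]

    stats = []
    for i in range(n_items):
        p = sum(row[i] for row in responses) / n_students
        d = (sum(row[i] for row in upper) - sum(row[i] for row in lower)) / k
        stats.append((p, d))
    return stats


if __name__ == "__main__":
    # Four hypothetical students, three items; halves used as groups.
    scores = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
    for item, (p, d) in enumerate(item_statistics(scores, 0.5), start=1):
        print(f"item {item}: p = {p:.2f}, D = {d:.2f}")
```

With very small groups the upper and lower sets contain only a handful of students, so D can change sharply when a single answer changes.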

Psychometric analysis proved able to detect errors in the formulation of items, and can thereby help improve the test as a whole. Difficulty was more significant than discrimination for detecting errors in the items, probably because the statistical techniques used are not valid when applied to small groups of students. However, the relationship between p and D is linked to whether an item needs corrective action. As a result, we propose basic guidelines that should be taken into account when reformulating such items.