AN EMPIRICAL EVALUATION OF DIFFERENT TECHNIQUES FOR STUDENTS’ PEER ASSESSMENT
The implementation of the European Higher Education Area promotes the adoption of more active learning techniques that help develop students’ lifelong learning skills. A requirement for lifelong learning is the ability to carry out self-assessment or peer assessment. That is, once they have passed the course, students should be able to assess the quality of a piece of work without the supervision of a teacher, identifying its strengths and weaknesses. Until now, assessment has mostly been carried out by teachers in isolation, without involving the students. As a result, students did not know how the assessment process was performed. If someone does not know how a process is performed, it will be quite difficult, or simply not feasible, for him or her to repeat it. Therefore, it becomes necessary to involve students in these assessment processes in order to promote the acquisition of competences such as analysis, criticism and self-criticism.
Nevertheless, if we want students to be able to carry out assessment processes properly, several issues must be taken into account. For instance, students do not behave as fair raters. We have observed that they usually give higher marks to their close friends’ work, regardless of its quality. In general, students tend to give high marks to their classmates. Moreover, from a statistical point of view, the mean and the standard deviation of the distribution of marks for each piece of work reveal that we cannot rely on these data to draw conclusions about the quality of the work assessed. In general, the average rating is approximately the same for every piece of work. In addition, the standard deviation for each piece of work is high, which reflects variations in the assessment process that cannot be neglected.
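The statistical symptom described above can be illustrated with a minimal sketch. The marks below are invented purely for illustration and are not data from the study; the 0-10 scale, the work labels and the `summarise` helper are likewise assumptions.

```python
from statistics import mean, stdev

# Hypothetical marks (assumed 0-10 scale) given by five peers to three
# pieces of work. Invented for illustration; NOT data from the study.
marks = {
    "work_a": [9, 6, 10, 7, 8],
    "work_b": [8, 10, 6, 9, 7],
    "work_c": [7, 9, 8, 10, 6],
}

def summarise(marks_by_work):
    """Return {work: (mean, sample standard deviation)} for each piece of work."""
    return {work: (mean(ms), stdev(ms)) for work, ms in marks_by_work.items()}

summary = summarise(marks)
for work, (avg, sd) in summary.items():
    print(f"{work}: mean={avg:.2f}, sd={sd:.2f}")
```

By construction, the three pieces of work in this toy data set share the same mean, and the spread of marks within each one is large compared with the differences between means, so the averages alone cannot be used to rank the works.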
To mitigate these effects, we have tried three techniques for students’ peer assessment. Furthermore, we have evaluated the performance of each alternative by using it in a Software Engineering course. We designed four experiments that were carried out using our students as subjects. Finally, we analysed the gathered data in order to identify the strengths and weaknesses of each approach.
This paper describes the alternatives we have analysed, the results obtained after using these techniques in a real scenario, the statistical analysis of these results and a discussion of the strengths and weaknesses of each approach.