EXAMINING THE USE OF SELF-ASSESSMENTS IN MOOCS THROUGH A/B TESTING
Georgetown University (UNITED STATES)
About this paper:
Appears in: INTED2017 Proceedings
Publication year: 2017
Pages: 7931-7936
ISBN: 978-84-617-8491-2
ISSN: 2340-1079
doi: 10.21125/inted.2017.1865
Conference name: 11th International Technology, Education and Development Conference
Dates: 6-8 March, 2017
Location: Valencia, Spain
Abstract:
The Terrorism and Counterterrorism Massive Open Online Course (MOOC) offered by GeorgetownX is now in its third iteration. One of the assessment strategies used in the course is the open-response self-assessment. MOOC participants are prompted to respond to open-ended questions; they are then given a rubric and asked to compare their response to a model response from the instructor. The rubric includes the following criteria: "I missed most of the key points in the answer and I need to review the material" = 0 points; "I missed one of the key points in the answer and I need to review parts of the material" = 1 point; "I didn't miss any of the key points and I have a solid understanding of this material" = 2 points.

In the first iteration of the Terrorism and Counterterrorism MOOC, a preliminary analysis of the submitted self-assessments revealed that students were not scoring their responses appropriately. Students took advantage of the self-scoring method by putting minimal effort into answering the questions and then awarding themselves full credit. Of the 8,447 self-assessment responses submitted throughout the course, 2,720 were blank. Of those blank responses, 2,107 (77.4%) were self-scored as 2 out of 2.
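
The arithmetic behind the reported proportion is straightforward; a minimal check in Python, using only the counts quoted above:

    # Counts reported in the abstract: 2,720 blank responses,
    # of which 2,107 were self-scored with full credit (2 out of 2).
    blank_full_credit = 2107
    blank_total = 2720
    print(f"{blank_full_credit / blank_total:.2%}")  # 77.46% (reported as 77.4%)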

In the second iteration of the course, A/B testing was set up to answer the following research question: will an additional prompt affect students' self-assessment scores by deterring them from scoring blank responses with full credit? Students were randomly assigned to one of two groups: Group A received the same instructions as in the first iteration, plus an additional prompt indicating that the course team might review individual submissions; Group B received only the first-iteration instructions, with no additional prompt. Group A had N=646 students who generated a total of 6,312 responses, while Group B had N=602 students who generated 5,955 responses.
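
The abstract does not describe how the platform performed the random assignment; one common approach for this kind of A/B split is a deterministic hash of the user ID, sketched below. The function name and ID format are illustrative assumptions, not the edX implementation.

    import hashlib

    def assign_group(user_id: str) -> str:
        # Hash the user ID so the split is effectively random but reproducible:
        # the same student always lands in the same group across sessions.
        digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
        return "A" if int(digest, 16) % 2 == 0 else "B"

    # Example: assign_group("student_12345") returns "A" or "B", stable per student.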

Following the course, the self-assessment responses for each question were compiled, divided according to group membership, and evaluated by their content (or lack thereof). A response was considered blank if it was: empty; nonsensical (an assortment of numbers); a one-word answer; irrelevant; unexplanatory ("Negative", "yes", "no"); a phrase showing a lack of effort or understanding of the question ("I don't know"); a short opinion statement ("It is terrible"); a website link; a response that only repeated the question; or another generic short response that did not attempt to answer the prompt.
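
The evaluation described above was done by the course team; a heuristic pre-filter for the mechanical cases (empty, numbers only, one-word, stock non-answers, links, repeated question) might look like the sketch below. The function name and pattern set are assumptions for illustration; judgment calls such as "irrelevant" or "lack of effort" would still require human review.

    import re

    NON_ANSWERS = {"negative", "yes", "no", "i don't know", "it is terrible"}

    def looks_blank(response: str, question: str) -> bool:
        text = response.strip().lower()
        if not text:
            return True                                   # empty
        if re.fullmatch(r"[\d\s\W]+", text):
            return True                                   # numbers/punctuation only
        if len(text.split()) == 1:
            return True                                   # one-word answer
        if text in NON_ANSWERS:
            return True                                   # stock non-answers
        if text.startswith(("http://", "https://", "www.")):
            return True                                   # bare website link
        if text == question.strip().lower():
            return True                                   # repeats the question
        return False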

Group A had 121 blank responses out of its 6,312 total, with a mean self-score of 1.31, while Group B had 252 blank responses out of 5,955, with a mean of 1.82. Pearson's chi-square test was used to compare the distributions of scores and test whether group membership had an effect on how students scored their blank responses. The scores were found to be statistically dependent on the student's group membership (p = 2.664e-10). Given that students were randomly assigned to the groups, it can be inferred that the additional warning prompt given to Group A caused this difference between group scores.
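
The abstract reports only the group means and the p-value, not the full score breakdown, so the contingency table below uses invented placeholder counts chosen only to be consistent with the reported blank-response totals and means; it is a sketch of the standard Pearson chi-square of score (0/1/2) against group, not a reproduction of the study's data.

    from scipy.stats import chi2_contingency

    # Hypothetical counts of self-scores (0, 1, 2 points) for blank responses.
    # Placeholders consistent with the reported totals and means
    # (Group A: 121 blanks, mean 1.31; Group B: 252 blanks, mean 1.82),
    # NOT the actual data from the study.
    observed = [
        [22, 39, 60],    # Group A: 22 + 39 + 60 = 121, mean (39 + 2*60)/121 = 1.31
        [8, 29, 215],    # Group B: 8 + 29 + 215 = 252, mean (29 + 2*215)/252 = 1.82
    ]

    chi2, p, dof, expected = chi2_contingency(observed)
    print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p:.3g}")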

In conclusion, the additional warning prompt added to the self-assessment instructions appears to have influenced how students scored their responses against the rubric. This result raises course-design questions, such as how to structure and grade self-assessments. It also raises questions about the ethical behavior of students.
Keywords:
MOOC, self-assessment, course design.