SUMMATIVE EXAMS WITH THE USE OF CHATGPT: VISION OR REALISTIC ALTERNATIVE TO TRADITIONAL EXAMS?
Technical University of Munich (GERMANY)
About this paper:
Appears in: INTED2024 Proceedings
Publication year: 2024
Pages: 3980-3990
ISBN: 978-84-09-59215-9
ISSN: 2340-1079
doi: 10.21125/inted.2024.1026
Conference name: 18th International Technology, Education and Development Conference
Dates: 4-6 March, 2024
Location: Valencia, Spain
Abstract:
The intensive and dynamic development of large language models such as ChatGPT and their influence on university teaching and learning have been the subject of much discussion, especially since the beginning of 2023. Contributions to this discussion focus on suggestions and ideas for redesigning teaching, supporting students and teachers, or adapting specific competence profiles for degree courses and theses. Many publications also note that examinations should undergo significant adjustment and redesign to reflect these advances in AI.

However, there is an almost complete lack of practical experience and evaluation results, particularly in the area of summative examinations with larger cohorts. In most cases, the discussions remain limited to conditionals such as "should," "could," and "would."

In practice, however, digital examinations using ChatGPT pose various challenges, and open research questions remain regarding the conceptual design and technical implementation of such an examination:
- What specific framework conditions are required for digital exams with the use of ChatGPT to ensure the equality of examinees?
- How must the content of a digital exam be designed so that a meaningful assessment of the examinees' knowledge and skills is possible despite the explicit use of ChatGPT?
- What findings about the examinees and examiners result from the practical implementation and evaluation of a ChatGPT-supported examination?

At the Technical University of Munich (TUM), we have taken the current ideas and proposals regarding the supportive use of AI in summative examinations as an opportunity to design, conduct, and evaluate a large "Pilot ChatGPT Exam" based on the existing state of knowledge.

The motivation for this voluntary and ungraded pilot exam was to gain initial experience with a larger exam in which ChatGPT is explicitly integrated and, for some tasks, also required to complete them. The pilot exam was subsequently evaluated in more detail using a dedicated questionnaire, in order to consider not only the students' exam results but also their individual experiences of using ChatGPT during the exam.

In this article, we first present the current state of knowledge on summative exams using ChatGPT and explain the requirements and suggestions for implementing a ChatGPT exam published in the literature and case studies.

Afterward, the article focuses on the concrete design and implementation of the pilot exam with the use of ChatGPT. The framework conditions and technical requirements, the design of the questions, and the realization as a supervised lecture hall exam using the LMS Moodle with a bring-your-own-device (BYOD) approach are examined in more detail.

The final section summarizes the findings from the evaluation of the pilot exam. The students provide detailed insights into their level of competence with ChatGPT and into the concrete ways in which the AI supported them with the exam questions and tasks.

A summary and an outlook on future examinations with AI support round off the article.
Keywords:
Summative exam, ChatGPT, Exam evaluation.