DIGITAL LIBRARY
USING ORACLE QUERY PLAN FOR AUTOMATED ASSESSMENT OF SQL
University of Sunderland (UNITED KINGDOM)
About this paper:
Appears in: INTED2023 Proceedings
Publication year: 2023
Pages: 4978-4982
ISBN: 978-84-09-49026-4
ISSN: 2340-1079
doi: 10.21125/inted.2023.1296
Conference name: 17th International Technology, Education and Development Conference
Dates: 6-8 March, 2023
Location: Valencia, Spain
Abstract:
In our second-year undergraduate database module students are exposed to using the Oracle cloud environment for developing SQL (Structured Query Language) queries. Their first assessment for the module is of a problem-solving nature, whereby they are expected to repair a partially developed SQL schema for a given scenario, before subsequently developing SQL queries based on the scenario.

Students then submit a complete SQL script file containing the corrected database schema and their SQL solutions to the given queries. This is then marked by the module tutors, with the SQL code being marked by hand, which can be laborious, error prone and time consuming. Student SQL solutions can be written in a variety of methods which may all be correct, so sample solutions need to be developed which cater for all potential correct solutions.

The aim of this paper is therefore to describe a prototype automated assessment system for marking the SQL queries developed by the students. Several previous authors have worked primarily in automated assessment in for example programming and marking of ER Diagrams, but there is limited evidence of successful automated assessment of SQL queries. The XData system [1] focuses on generating query specific test cases to catch common errors in queries, whereas Gradience [2] uses fixed query independent or manually constructed datasets in automated assessment of SQL queries. Both of these have issues in identifying all potential solutions for an SQL query, especially in finding solutions which are close but not one hundred percent correct, requiring a number of sample solutions to be provided for each SQL query.

Our proposed solution is to investigate the use of the query optimisation tool in Oracle, alongside various machine learning techniques, to enable automated marking and grading of SQL queries against one sample solution to improve accuracy of automated grading.

This paper will outline the currently developed solution, which is implemented in python, whereby we have been able to successfully extract the optimised query plans from Oracle to enable direct comparison of a collection of sample SQL queries against student solutions. By using Oracle’s query plan, which produces an optimised solution for an SQL query, it allows us to compare queries where, for example, join operations, use of subqueries instead of joins, use of different comparison operators (e.g., LIKE instead of ‘=’), etc., are classified as correct and therefore allow us to identify correct student solutions without the requirement of having to produce all possible permutations of correct solutions. There are currently some minor issues in how the query plan, for example, specifies table and attribute names when, for example, table aliases are used, which we are currently working on and will discuss in the paper. The paper will therefore discuss and provide examples of SQL queries where we are able to match student solutions using real world student solutions, discuss issues which are currently being enhanced, before moving on to investigate how machine learning and natural language processing could be effectively developed as future work to allow for partial grading of student solutions which are not one hundred percent correct according to the sample solution provided by the module tutor. It will outline an undergraduate final year project which is in the early stages of investigating these technologies and how they could be implemented.
Keywords:
SQL, automated marking, assessment.