VIDEO CLICKSTREAM DATA VISUALIZATION AND CLUSTERING FOR THE IMPROVEMENT OF LEARNING RESOURCES AND THE FACILITATION OF INTERVENTION
Universitat Oberta de Catalunya (SPAIN)
About this paper:
Conference name: 13th International Technology, Education and Development Conference
Dates: 11-13 March, 2019
Location: Valencia, Spain
Abstract:
The importance of video-based resources for teaching is now beyond dispute, in all types of teaching (fully distance learning, blended learning, face to face) and at all scales, from massive courses to very small class groups, such as flipped classrooms. It is therefore very important to analyse the use of those resources in all kinds of teaching environments. Most, if not all, current tools and methodologies are oriented towards MOOC-scale use and focus on global, aggregate data. They are therefore very useful for detecting dropout or problematic points in a video. However, this focus results in tools and methodologies of limited usefulness at medium or small scales: they do not provide actionable data unless hundreds of students have watched a video, and they cannot provide useful information on a single-student basis to assist with teaching intervention and/or the personalization of the teaching-learning process. For most classroom sizes, a different approach is needed.
We have developed a tool to collect video clickstream data, together with a visualization of the collected data on a single-session basis (sessions are currently kept fully anonymous in order to comply with privacy regulations, but the option to uniquely identify students is retained). This visualization has proved to be of interest to teachers, revealing unexpected student behavior consistently across different kinds of videos in diverse courses. In particular, skipping and pausing are shown to be much more prevalent than instructors assume.
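As an illustration of what a single-session view can be built from, the following minimal sketch turns a list of player events into a trajectory of video position over session time. The event schema (`wall_time`, `kind`, `video_pos`), the function name `session_trajectory`, and the one-second sampling step are assumptions made for this example, not the format used by the tool described here.

```python
def session_trajectory(events, step=1.0):
    """Sample the playhead position at regular intervals of session time.

    events: list of (wall_time, kind, video_pos) tuples sorted by wall_time,
    with kind in {"play", "pause", "seek"}. This schema is hypothetical.
    """
    if not events:
        return []
    start, end = events[0][0], events[-1][0]
    trajectory = []
    playing = False
    last_time, last_pos = start, events[0][2]
    idx = 0
    k = 0
    while start + k * step <= end:
        t = start + k * step
        # Apply every event that has happened up to time t.
        while idx < len(events) and events[idx][0] <= t:
            last_time, kind, last_pos = events[idx]
            if kind == "play":
                playing = True
            elif kind == "pause":
                playing = False
            # A "seek" moves the playhead but keeps the play/pause state.
            idx += 1
        # While playing, the playhead advances in step with session time.
        trajectory.append(last_pos + (t - last_time) if playing else last_pos)
        k += 1
    return trajectory


# A session with one pause (seconds 3 to 5) and one skip (from 4 to 20).
events = [
    (0, "play", 0),
    (3, "pause", 3),
    (5, "play", 3),
    (6, "seek", 20),
    (8, "pause", 22),
]
print(session_trajectory(events))
```

In such a trajectory a pause appears as a flat segment and a skip as a jump, which is the kind of behavior a per-session plot makes visible.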
In contrast to large-scale visualizations, which are of limited use for medium or small-scale data, our first visualization can be overwhelming when presenting data for medium-sized groups of sessions if no filtering is provided. Thus, a solution is needed to avoid overwhelming instructors or content producers without data analysis skills. We have explored the use of clustering techniques for this purpose, allowing sessions with similar behavior to be grouped.
One challenge in clustering clickstream data is that the data yield trajectories which, being moderately high-dimensional, cannot be analysed directly with the most common distances and summarization methods used in the field. We have explored the available metrics for measuring distances between curves, such as the Fréchet distance, and have finally settled on dynamic time warping as a measure of similarity for our analysis. The use of agglomerative clustering techniques has proved moderately useful for separating sessions according to similarity and has provided some insight to the instructors involved.
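The combination of dynamic time warping and agglomerative clustering can be sketched as follows. This is a minimal illustration, not the implementation used in this work: the textbook DTW recurrence with absolute difference as local cost, average linkage, the function names, and the toy trajectories (video position sampled once per second of session time) are all assumptions made for the example.

```python
from itertools import combinations


def dtw_distance(a, b):
    """Textbook dynamic time warping distance between two 1-D sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


def agglomerative(sessions, n_clusters):
    """Average-linkage agglomerative clustering on pairwise DTW distances.

    Returns a list of clusters, each a list of session indices.
    """
    dist = {
        (i, j): dtw_distance(sessions[i], sessions[j])
        for i, j in combinations(range(len(sessions)), 2)
    }

    def linkage(c1, c2):
        pairs = [(min(i, j), max(i, j)) for i in c1 for j in c2]
        return sum(dist[p] for p in pairs) / len(pairs)

    clusters = [[i] for i in range(len(sessions))]
    while len(clusters) > n_clusters:
        # Merge the two clusters with the smallest average distance.
        x, y = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


# Toy trajectories (hypothetical data): two linear viewers, two skippers.
sessions = [
    [0, 1, 2, 3, 4, 5, 6, 7],             # watches straight through
    [0, 1, 2, 2, 2, 3, 4, 5, 6, 7],       # same, with a short pause
    [0, 1, 20, 21, 22, 40, 41, 42],       # skips forward twice
    [0, 1, 2, 20, 21, 40, 41, 42, 43],    # skips at slightly different times
]
print(agglomerative(sessions, n_clusters=2))
```

Because DTW allows one point to align with several, the paused session is at distance zero from the uninterrupted one. Whether pauses should be ignored in this way depends on the analysis goal, and the choice of local cost and linkage would need to be tuned for real data.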
In conclusion, we have created a first, fully functional prototype of a tool that allows instructors and video content producers to perform a fine-grained analysis of student behaviour while watching these resources. It should facilitate both the detection of global issues with the resources and the surfacing of issues affecting smaller groups of students, which should in turn lead to teaching intervention.
Keywords:
Video, learning analytics.