AN ATTEMPT TO ESTIMATE HESITATION DURING ENGLISH WORD-ORDERING TASKS USING GAZE INFORMATION
Shizuoka University (JAPAN)
About this paper:
Appears in: INTED2026 Proceedings
Publication year: 2026
Article: 2336
ISBN: 978-84-09-82385-7
ISSN: 2340-1079
doi: 10.21125/inted.2026.2336
Conference name: 20th International Technology, Education and Development Conference
Dates: 2-4 March, 2026
Location: Valencia, Spain
Abstract:
We are developing a web application that estimates the “degree of hesitation” learners experience while solving English word-ordering tasks, in which randomly arranged English words must be rearranged to match a given Japanese sentence. Previous studies sought to improve estimation accuracy by designing new features derived from mouse-trajectory data; measured by F-score, estimation performance reached approximately 84% at the problem level and 68% at the word level. In the present study, we take a different approach and additionally measure and collect learners’ eye-gaze data. Eye movements directly indicate which information the learner is attending to, and they can capture how gaze shifts between the Japanese sentence and the English words even during moments of deliberation when the mouse is not moving. By using mouse-trajectory data and gaze-tracking data in a complementary manner, we aim to estimate hesitation states that could not be captured in previous studies and thereby improve estimation accuracy. In this paper, with a view toward practical use in broader educational settings, we propose a framework for a custom application that measures eye gaze using only a standard webcam.

We attempted to build a custom application that requires no dedicated eye tracker, with the aim of developing a language-learning support tool that is more accessible and nearly cost-free. The application under development measures gaze information using a standard webcam attached to a typical PC. Because webcam-based tracking relies only on image data of the face and eye region, it is more susceptible to head movements and lighting conditions than dedicated infrared-based eye trackers. To mitigate this issue, we adopted a problem-design approach. Specifically, we redesigned the task from an English word-ordering task to a listening-comprehension task in which learners select multiple on-screen buttons in accordance with a problem statement or audio. We also designed a new interface in which the buttons (choices) are large and spaced widely apart.
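
The benefit of large, widely spaced buttons can be illustrated with a minimal sketch of gaze-to-button mapping. It is not the application's actual implementation: the `Button` class, the `button_under_gaze` function, the pixel coordinates, and the 40-pixel `margin` are all hypothetical. The idea is that when the gaps between buttons are wider than the expected gaze error, each bounding box can be grown by a tolerance margin without two boxes overlapping.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Button:
    """An on-screen choice, described by its bounding box in pixels (hypothetical layout)."""
    label: str
    left: float
    top: float
    width: float
    height: float

    def contains(self, x: float, y: float, margin: float = 0.0) -> bool:
        """Return True if (x, y) lies inside the box grown by `margin` pixels on every side."""
        return (self.left - margin <= x <= self.left + self.width + margin
                and self.top - margin <= y <= self.top + self.height + margin)


def button_under_gaze(x: float, y: float, buttons: Sequence[Button],
                      margin: float = 40.0) -> Optional[str]:
    """Map one gaze estimate to a button label, or None if no unique button matches."""
    hits = [b.label for b in buttons if b.contains(x, y, margin)]
    # With wide spacing the grown boxes should not overlap; an ambiguous
    # sample (more than one hit) is treated the same as no hit.
    return hits[0] if len(hits) == 1 else None


buttons = [
    Button("A", left=100, top=100, width=300, height=200),
    Button("B", left=700, top=100, width=300, height=200),
]
print(button_under_gaze(420, 150, buttons))  # 20 px right of A, still inside the margin
print(button_under_gaze(550, 150, buttons))  # in the gap between A and B
```

In this sketch a gaze estimate that lands slightly outside a button is still attributed to it, while a sample in the gap between buttons is discarded rather than guessed.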

We use Random Forest as the classifier, and the labels for machine learning are binarized into “considerably hesitant” and “hardly hesitant.” The feature set used in a previous study consists of:
- Answer time
- Total mouse-movement distance
- Time before and after first drag
- Average and maximum mouse-movement speed
- Maximum mouse-idle time
- Number of drag and drop (D&D) operations
- Maximum D&D duration
- Maximum and total interval between D&D operations
- Number of mouse U-turns (X-axis, Y-axis)
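
Several of these features can be computed from a timestamped mouse log. The following is a minimal sketch, not the authors' implementation: the abstract does not give exact definitions, so the `(time, x, y)` sample format, the function name `mouse_features`, the `idle_eps` threshold for treating the mouse as idle, and the use of the log's time span as the answer time are all assumptions.

```python
import math
from typing import Dict, List, Tuple

Sample = Tuple[float, float, float]  # (time in seconds, x in pixels, y in pixels)


def mouse_features(samples: List[Sample], idle_eps: float = 1.0) -> Dict[str, float]:
    """Compute distance, speed, and idle-time features from a mouse trajectory.

    A step shorter than `idle_eps` pixels counts as idle (assumed threshold).
    """
    total_distance = 0.0
    max_speed = 0.0
    max_idle = 0.0
    current_idle = 0.0
    for (t0, x0, y0), (t1, x1, y1) in zip(samples, samples[1:]):
        dt = t1 - t0
        step = math.hypot(x1 - x0, y1 - y0)
        total_distance += step
        if dt > 0:
            max_speed = max(max_speed, step / dt)
        if step < idle_eps:
            # Extend the current idle run and remember the longest one.
            current_idle += dt
            max_idle = max(max_idle, current_idle)
        else:
            current_idle = 0.0
    # Assumption: answer time is approximated by the time span of the log.
    answer_time = samples[-1][0] - samples[0][0] if samples else 0.0
    avg_speed = total_distance / answer_time if answer_time > 0 else 0.0
    return {
        "answer_time": answer_time,
        "total_distance": total_distance,
        "avg_speed": avg_speed,
        "max_speed": max_speed,
        "max_idle_time": max_idle,
    }


log = [(0.0, 0, 0), (1.0, 3, 4), (2.0, 3, 4), (4.0, 3, 4), (5.0, 6, 8)]
print(mouse_features(log))
```

In the example log the pointer moves 5 pixels, rests for 3 seconds, then moves another 5 pixels, so the longest idle run is 3 seconds.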

The newly added gaze-based features are:
- Total gaze-point movement distance
- Number of gaze-point U-turns (X-axis, Y-axis)
A gaze-point U-turn is defined similarly to the mouse U-turn, based on directional reversal along each axis.
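
As a minimal sketch of these gaze features, a U-turn can be counted whenever the sign of the step along an axis flips. The names `count_u_turns` and `gaze_features` are hypothetical, and the `min_move` dead band is an assumption added here because webcam gaze estimates tend to jitter; the abstract does not state whether any such filtering is applied.

```python
import math
from typing import Dict, Sequence, Tuple


def count_u_turns(values: Sequence[float], min_move: float = 0.0) -> int:
    """Count directional reversals along one axis.

    Steps whose magnitude is at most `min_move` are skipped, so small
    jitter does not register as a reversal (assumed noise handling).
    """
    turns = 0
    last_direction = 0  # +1 or -1 once a direction is known, 0 before that
    for prev, cur in zip(values, values[1:]):
        delta = cur - prev
        if abs(delta) <= min_move:
            continue
        direction = 1 if delta > 0 else -1
        if last_direction != 0 and direction != last_direction:
            turns += 1
        last_direction = direction
    return turns


def gaze_features(points: Sequence[Tuple[float, float]],
                  min_move: float = 0.0) -> Dict[str, float]:
    """Total gaze-point path length and U-turn counts on the X and Y axes."""
    distance = sum(math.hypot(x1 - x0, y1 - y0)
                   for (x0, y0), (x1, y1) in zip(points, points[1:]))
    return {
        "gaze_distance": distance,
        "gaze_u_turns_x": count_u_turns([p[0] for p in points], min_move),
        "gaze_u_turns_y": count_u_turns([p[1] for p in points], min_move),
    }


points = [(0, 0), (10, 0), (20, 5), (12, 5), (15, -5)]
print(gaze_features(points))                # every reversal counts
print(gaze_features(points, min_move=5.0))  # steps of 5 px or less are ignored
```

The same `count_u_turns` function applies unchanged to mouse coordinates, which matches the statement that the two U-turn features are defined similarly. Note that a dead band of this kind also ignores slow drift made up of many small steps, which may or may not be desirable.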

In this study, we proposed a framework that incorporates gaze information as new features in addition to mouse-trajectory data to improve the accuracy of estimating hesitation in English word-ordering tasks. Future work includes further refinement of gaze-based features, such as modeling gaze-transition patterns between specific words and measuring how frequently gaze moves among the Japanese sentence, the answer area, and the problem-display area. We also plan to collect data from a larger number of participants and conduct more extensive experiments to rigorously validate the effectiveness of the proposed method and its suitability for use in educational settings.
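
One possible form of the region-level gaze feature mentioned as future work is a count of transitions between screen regions. The sketch below is purely illustrative and does not describe an implemented part of the framework: the region labels `"sentence"`, `"answer"`, and `"problem"`, the function name `count_transitions`, and the convention that `None` marks a sample outside every region are all assumptions.

```python
from collections import Counter
from typing import Iterable, Optional


def count_transitions(regions: Iterable[Optional[str]]) -> Counter:
    """Count moves between distinct regions in a per-sample region sequence.

    Samples labeled None (gaze outside every region) are skipped, so a
    brief dropout between two regions still counts as one transition.
    """
    counts: Counter = Counter()
    previous = None
    for region in regions:
        if region is None:
            continue
        if previous is not None and region != previous:
            counts[(previous, region)] += 1
        previous = region
    return counts


# Hypothetical per-sample region labels for one problem.
sequence = ["sentence", "sentence", None, "problem", "problem",
            "sentence", "answer", "sentence", "answer"]
transitions = count_transitions(sequence)
print(transitions[("sentence", "answer")])
print(sum(transitions.values()))
```

Each ordered pair of regions yields one count, so the resulting values could be appended to the feature vector alongside the distance and U-turn features.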
Keywords:
Hesitation, word-ordering task, gaze, Web application, estimation.