MACHINE IDENTIFICATION OF COLLABORATION DIALOGUE ACTS IN TYPED-CHAT COLLABORATIVE PROBLEM-SOLVING
1 Valparaiso University (UNITED STATES)
2 North Carolina A&T State University (UNITED STATES)
About this paper:
Appears in: EDULEARN20 Proceedings
Publication year: 2020
Page: 6089
ISBN: 978-84-09-17979-4
ISSN: 2340-1117
doi: 10.21125/edulearn.2020.1597
Conference name: 12th International Conference on Education and New Learning Technologies
Dates: 6-7 July, 2020
Location: Online Conference
Abstract:
The aim of this study is computer identification of collaborative dialogue acts within student group problem-solving conversations. The COMPS project administers group problem-solving exercises in undergraduate college classes. The students collaborate in groups of three or four, communicating via typed chat. A goal of the project is to build the knowledge needed to gauge the quality of the collaborative dialogue, in particular whether the participants are engaging with each other. The ultimate goal is automated monitoring of the dialogues that provides a summary to the instructor.

In this study approximately a thousand dialogue turns from ten different student conversation groups were classified according to four broad categories of dialogue acts:
A) sharing ideas,
B) negotiating (which includes agreeing or disagreeing),
C) regulating the dialogue or problem-solving activity,
D) non-problem-solving chit-chat.
Within problem-solving dialogues, sequences of these dialogue acts occur in characteristic patterns.

One high-frequency pattern is A-B (sharing followed by negotiating), in which one student advances an idea and another student responds. This pattern is what would be expected of transactive group knowledge-building dialogue. Note that merely detecting the presence of D off-topic turns is not necessarily diagnostic of unproductive conversation: in productive dialogues, D turns tend to occur between discussions of different segments of the problem prompt, a pattern that can be recognized from the C dialogue-regulation turns.
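As an illustrative sketch (not part of the published study), recognizing and counting A-B pairs from a sequence of per-turn labels reduces to a simple adjacency count; the function name and example labels below are hypothetical:

    # Hypothetical sketch: count A-B (sharing followed by negotiating) pairs
    # in a list of per-turn dialogue act labels such as ["A", "B", "C", "D"].
    def count_ab_pairs(labels):
        return sum(1 for prev, curr in zip(labels, labels[1:])
                   if prev == "A" and curr == "B")

    # Two A-B pairs in this example dialogue:
    print(count_ab_pairs(["A", "B", "C", "D", "A", "A", "B"]))  # -> 2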

This study tests two computational paradigms for extracting features from the dialogue text. One is the doc2vec neural network; the other is statistical identification of characteristic words or phrases.

Doc2vec derives text embeddings, vectors of real numbers where the vectors for semantically similar texts are numerically similar. The embedding for a whole dialogue turn becomes a set of features for a linear classifier to recognize the dialogue act. We also experimented with using the preceding one or two turns as context, classifying the dialogue act of a turn from the embeddings of the turn and its preceding turns.
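A minimal sketch of such a pipeline, assuming the gensim Doc2Vec implementation and scikit-learn logistic regression as the linear classifier (the actual models, hyperparameters, and data format used in the study are not specified in this abstract, and the variable `turns` is hypothetical):

    from gensim.models.doc2vec import Doc2Vec, TaggedDocument
    from sklearn.linear_model import LogisticRegression

    # turns: hypothetical list of (tokenized_turn, dialogue_act_label) pairs,
    # e.g. (["i", "agree", "with", "that"], "B").
    corpus = [TaggedDocument(words=tokens, tags=[str(i)])
              for i, (tokens, _label) in enumerate(turns)]

    # Learn an embedding for each dialogue turn (sizes are illustrative).
    d2v = Doc2Vec(corpus, vector_size=100, window=5, min_count=2, epochs=40)

    # The turn embedding is the feature vector for a linear classifier.
    X = [d2v.infer_vector(tokens) for tokens, _label in turns]
    y = [label for _tokens, label in turns]
    clf = LogisticRegression(max_iter=1000).fit(X, y)

    # Context variant: concatenate the embeddings of the preceding one or
    # two turns with the embedding of the current turn before classifying.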

The motivation for diagnostic words comes from manually examining the transcripts; e.g. the phrase “I agree” may indicate a B negotiating turn. The text was pre-processed to contain only the most common 10,000 words, which filters out many words specific to the technical problems under discussion. The Fisher exact test was used to score which words co-occur with the different dialogue acts with much higher than chance probability. The presence or absence of each such characteristic word is then a feature for the linear classifier.
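A sketch of how such diagnostic words might be scored, assuming scipy's fisher_exact and a one-vs-rest 2x2 contingency table per word and dialogue act (the significance cutoff and data layout below are illustrative, not taken from the study):

    from scipy.stats import fisher_exact

    def diagnostic_words(turns, act, alpha=0.001):
        # turns: hypothetical list of (set_of_words, dialogue_act_label) pairs.
        vocab = set().union(*(words for words, _ in turns))
        selected = []
        for w in vocab:
            a = sum(1 for words, lab in turns if w in words and lab == act)
            b = sum(1 for words, lab in turns if w in words and lab != act)
            c = sum(1 for words, lab in turns if w not in words and lab == act)
            d = sum(1 for words, lab in turns if w not in words and lab != act)
            # Is the word over-represented in turns labelled with this act?
            _odds, p = fisher_exact([[a, b], [c, d]], alternative="greater")
            if p < alpha:
                selected.append(w)
        return selected

    # The presence or absence of each selected word then becomes a binary
    # feature for the linear classifier.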

Results show that doc2vec embeddings identify A dialogue acts with an F1-score (combined precision and recall) of about 0.6 to 0.7. For B and C acts F1 ≈ 0.5; for D off-topic turns F1 ≈ 0.4. Adding the immediately preceding turns as context does not improve classification accuracy. Classifiers trained on the diagnostic-word features have not outperformed those trained on the embeddings.
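For reference, the F1-score is the harmonic mean of precision and recall, F1 = 2PR / (P + R); per-class values of this kind can be computed, for example, with scikit-learn (a generic sketch, not the study's evaluation code; y_true and y_pred are hypothetical per-turn label lists):

    from sklearn.metrics import f1_score

    # y_true, y_pred: gold and predicted dialogue act labels for each turn.
    per_class_f1 = f1_score(y_true, y_pred, labels=["A", "B", "C", "D"], average=None)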

In addition to ongoing work on deriving better classifiers, future work includes using the classifiers to recognize and count the prevalence of dialogue act sequences. We will experiment, for example, to see whether A-B sharing-negotiating pairs can be counted accurately enough to correctly distinguish dialogues in which students are engaging in problem-solving.
Keywords:
Collaborative dialogue, dialogue acts, computer-assisted collaborative learning.