VALIDATING AI-GENERATED CLASSROOM OBSERVATIONS: A PROOF-OF-CONCEPT STUDY
Universidad de los Andes (CHILE)
About this paper:
Conference name: 20th International Technology, Education and Development Conference
Dates: 2-4 March, 2026
Location: Valencia, Spain
Abstract:
This proof-of-concept study examines the validity of an AI-assisted classroom observation tool designed to assess instructional quality through video analysis using the World Bank’s TEACH framework. The tool aims to support scalable and context-sensitive teacher development by generating formative feedback aligned with evidence-based teaching practices.
We analyzed 20 primary school classroom recordings using two methods. Trained expert observers manually coded each session using the TEACH rubric and produced narrative justifications; in parallel, the AI system generated numerical scores and written feedback based on the same framework. Correlation analyses were conducted to estimate agreement between expert and AI scores across TEACH domains, including classroom culture, time on task, and support for cognitive and socioemotional development. We also compared AI-generated narratives with the “master justifications” used in expert training and calibration.
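As an illustration of the per-domain agreement analysis described above, the following is a minimal sketch rather than the study's actual procedure. The abstract does not state which correlation coefficient was used. Spearman rank correlation is assumed here because rubric scores are ordinal. The function names, domain keys, and scores are all hypothetical and are not data from the study.

```python
import math
from typing import Dict, List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one rater's scores are constant")
    return cov / math.sqrt(var_x * var_y)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation computed on average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two score lists of equal length, at least 2 items each")
    return pearson(average_ranks(x), average_ranks(y))


def agreement_by_domain(
    expert: Dict[str, List[float]], ai: Dict[str, List[float]]
) -> Dict[str, float]:
    """Compute expert-vs-AI Spearman correlation separately for each domain."""
    return {domain: spearman(expert[domain], ai[domain]) for domain in expert}


# Hypothetical scores for five recordings (illustrative only, not study data).
expert_scores = {
    "classroom_culture": [4, 3, 5, 2, 4],
    "socioemotional_skills": [3, 2, 4, 1, 3],
}
ai_scores = {
    "classroom_culture": [4, 3, 4, 2, 5],
    "socioemotional_skills": [2, 3, 3, 2, 4],
}

for domain, rho in agreement_by_domain(expert_scores, ai_scores).items():
    print(f"{domain}: rho = {rho:.2f}")
```

With only 20 recordings, any such coefficient carries wide uncertainty, so a per-domain estimate of this kind is better read as a rough indicator of agreement than as a precise measure.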
Findings revealed moderate to strong correlations in structured, observable domains such as time management and instructional clarity. However, alignment was weaker in dimensions requiring more nuanced socioemotional or relational interpretation. This suggests that while AI can replicate certain patterns of expert judgment, it currently struggles to fully capture culturally embedded or affective dimensions of teaching quality.
The significance of this study lies in its potential to inform the development of hybrid teacher feedback systems that are both scalable and grounded in professional standards. As education systems face increasing pressure to deliver high-impact teacher development at scale, often with limited resources, AI offers new opportunities to supplement human observation. However, empirical validation is essential to ensure such tools are accurate, ethical, and culturally responsive.
By providing early evidence on the feasibility and limitations of AI-generated classroom observations, this study contributes to a growing conversation about the role of automation in education. It also raises critical questions about the boundaries between algorithmic support and human judgment in evaluating complex instructional practices. Ultimately, this research lays the groundwork for future iterations of observation tools that integrate the strengths of AI with the deep contextual knowledge of educators, enabling more equitable and effective professional development across diverse educational contexts.
Keywords:
Classroom Observation, Artificial Intelligence in Education, Teacher Feedback, Teacher Professional Development.