Classroom discourse is a core medium of instruction: analyzing it can provide a window into teaching and learning and can drive the development of new tools for improving instruction. We introduce the largest dataset of mathematics classroom transcripts available to researchers, and demonstrate how this data can help improve instruction. The dataset consists of 1,660 observations of 4th and 5th grade elementary mathematics classrooms, each 45-60 minutes long, collected by the National Center for Teacher Effectiveness (NCTE) between 2010 and 2013. The anonymized transcripts represent data from 317 teachers across 4 school districts that serve largely historically marginalized students. The transcripts come with rich metadata, including turn-level annotations for dialogic discourse moves, classroom observation scores, demographic information, survey responses, and student test scores. We demonstrate that our natural language processing model, trained on our turn-level annotations, can learn to identify dialogic discourse moves, and that these moves are correlated with better classroom observation scores and learning outcomes. This dataset opens up several possibilities for researchers, educators, and policymakers to learn about and improve K-12 instruction. The data and its terms of use can be accessed here: https://github.com/ddemszky/classroom-transcript-analysis