Classroom facial expressions are an important window onto students' learning states. Using information technology to detect and recognize students' classroom expressions quickly, and to quantify the relationships among classroom expressions, learning states, and academic achievement, is an important direction of technological innovation for exploring how students learn in class and for helping teachers improve their teaching. Advances in computing power and in deep learning algorithms have made it feasible to perform face recognition and facial expression recognition on low-resolution images with lightweight networks. However, the application of these algorithms in education remains limited because of the lack of large-scale public classroom video datasets, the inconsistency between students' outward expressions and inner emotions, and the uncertain relationships among classroom expressions, learning states, and academic achievement.

Based on the classification framework of six basic emotions and on an expression classification framework optimized for classroom learning states, this study performs expression recognition and expression data analysis on classroom video data collected through long-term tracking at a middle school in Beijing. First, the three-stage cascaded convolutional neural network MTCNN is used to detect students' faces; combined with students' position coordinates, face recognition in the classroom scene is completed and a student classroom face dataset is created. Second, two expression classification frameworks are adopted: (1) a framework of six basic expressions, and (2) a self-designed framework of four classroom learning state expressions that reflects the characteristics of classroom learning. Part of the images in the dataset are manually annotated. Using the manually annotated student face dataset, this study trains the lightweight convolutional neural network Island Network and achieves classroom expression recognition under both the six basic expression framework and the learning state expression framework. Finally, the recognition results are analyzed statistically to explore their relationship with classroom learning states.

The results show that the neutral expression is the main expression recognized in class, followed by the smiling expression. Students' classroom smiling is unrelated to their degree of concentration. The distribution of classroom expressions differs by gender and by subject: the proportion of smiling expressions is higher for girls than for boys and lower in Chinese classes than in mathematics classes, and students' academic performance is negatively correlated with classroom smiling. The recognition results based on classroom learning states also show gender and subject differences: the proportion of tired expressions is significantly higher for boys than for girls and significantly higher in mathematics classes than in Chinese classes, and for both male and female students academic performance shows a significant negative correlation with classroom tired expressions.
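As an illustration of the final statistical step, the following minimal Python sketch computes each student's share of a given expression and correlates it with academic scores. It is a sketch under stated assumptions, not the procedure used in this study: the student IDs, label names, scores, and the helper functions `expression_proportions` and `pearson_r` are all hypothetical, and the study's actual statistical tests are not specified here.

```python
from collections import Counter
from math import sqrt


def expression_proportions(labels):
    """Return the share of each expression label among one student's recognized faces."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical data (not from the study): per-frame predicted labels and an exam score.
students = {
    "s01": (["neutral", "neutral", "tired", "neutral"], 88),
    "s02": (["tired", "tired", "neutral", "smile"], 71),
    "s03": (["neutral", "smile", "neutral", "neutral"], 93),
    "s04": (["tired", "neutral", "tired", "tired"], 64),
}

# Share of "tired" frames per student, paired with that student's score.
tired_share = [
    expression_proportions(labels).get("tired", 0.0)
    for labels, _ in students.values()
]
scores = [score for _, score in students.values()]

r = pearson_r(tired_share, scores)
print(f"tired share vs. score: r = {r:.3f}")
```

With real data, the same proportions could be grouped by gender or by subject before comparison, and a significance test would be needed before describing any difference or correlation as significant.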