As intelligent devices become widespread, the demand for emotionally anthropomorphic human-computer interaction keeps growing. As an important way for machines to perceive and understand human emotions, facial expression recognition (FER) has attracted wide attention in academia and industry. Existing FER methods, however, mainly target scenarios with a fixed set of expression categories, in which the recognition task and the data distribution remain unchanged between training and testing. This task paradigm cannot accurately reflect the complexity of real-world scenes. This thesis therefore summarizes and analyzes three major challenges faced by FER with changing categories: 1) recognition categories increase dynamically; 2) access to training categories is limited; 3) recognition categories are open and unknown. Research is carried out from three aspects: incremental recognition, one-class recognition, and open-set recognition. The main contents are as follows.

1) An incremental FER method is proposed. It extends expression recognition from static, fixed categories to dynamically added categories, while greatly reducing the time and space overhead of iterative optimization. The method uses a typical expression set to selectively retain samples of the original categories, which alleviates category imbalance in the training set. To address catastrophic forgetting, a centralized emotion distillation loss guides expression feature extraction and improves the stability of recognition on the original categories. Experimental results show that the method improves the flexibility of FER and reduces the accuracy loss incurred as categories are added.

2) A one-class FER method is proposed to detect and filter abnormal expression data. The method breaks through the limitation of a single information source: it introduces expert knowledge from related fields as conditional constraints, constructs an aggregated feature space of the source and target tasks, and enhances the compactness of the feature distributions of known categories. Experimental results show that the method improves the robustness of FER, and visualization reveals the feature extraction mechanism guided by expert knowledge.

3) Two open-set FER methods are proposed so that abnormal expression detection and target expression recognition work collaboratively, and a multi-illumination abnormal expression dataset containing 360 subjects is constructed. The first method is based on a two-stage adaptive attention mechanism combined with hierarchical heteroscedastic anomaly modeling, achieving multi-level expression feature extraction and efficient modeling of the normal expression distribution. On this basis, a second open-set FER method based on a bi-attention mechanism is proposed. Its bi-attention feature enhancement mechanism collaboratively fuses and enhances features according to the needs of open-set recognition. Experimental results show that both methods achieve efficient detection of abnormal expressions and accurate recognition of target categories.

In addition, the above algorithms are applied in building an FER system, which is validated in a bus driving environment.
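To make the three settings above concrete, the following is a minimal sketch of generic building blocks that such methods commonly rest on. It is not the thesis's implementation. The function names (`distillation_loss`, `select_exemplars`, `open_set_predict`), the temperature, and the threshold are all illustrative assumptions. The distillation term is the standard softened cross-entropy, not the centralized emotion distillation loss. The exemplar selection is a simple nearest-to-class-mean rule standing in for the typical expression set. The open-set rule is a plain confidence threshold, not the attention-based methods.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(old_logits, new_logits, temperature=2.0):
    """Cross-entropy between softened old-model and new-model outputs.

    Assumption: both logit lists cover the same (original) categories.
    A small value means the new model still behaves like the old one.
    """
    p_old = softmax(old_logits, temperature)
    p_new = softmax(new_logits, temperature)
    return -sum(p * math.log(q) for p, q in zip(p_old, p_new))


def select_exemplars(features, k):
    """Return indices of the k feature vectors closest to the class mean.

    Assumption: a stand-in for choosing "typical" samples to retain.
    """
    dim = len(features[0])
    mean = [sum(f[d] for f in features) / len(features) for d in range(dim)]

    def sq_dist(f):
        return sum((a - b) ** 2 for a, b in zip(f, mean))

    ranked = sorted(range(len(features)), key=lambda i: sq_dist(features[i]))
    return ranked[:k]


def open_set_predict(logits, threshold=0.5):
    """Return the argmax class, or -1 (unknown) if confidence is too low."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best if probs[best] >= threshold else -1


if __name__ == "__main__":
    old = [2.0, 0.5, -1.0]
    print(distillation_loss(old, old))               # baseline: identical outputs
    print(distillation_loss(old, [-1.0, 0.5, 2.0]))  # larger: outputs drifted
    print(select_exemplars([[0, 0], [1, 1], [10, 10], [1.2, 0.8]], 2))
    print(open_set_predict([4.0, 0.0, 0.0]), open_set_predict([0.1, 0.0, 0.05]))
```

In an incremental setting, a term like `distillation_loss` would typically be added to the ordinary classification loss on the new categories, with the retained exemplars replayed alongside the new data. How the terms are weighted is a design choice this sketch leaves open.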