With the help of large-scale labeled datasets, deep learning has developed rapidly and has been widely applied in computer vision, natural language processing, robotics, and other fields. In real-world scenarios, however, collecting enough labeled data is often time-consuming and labor-intensive. To reduce the dependence on labeled data, the data-efficient deep learning studied in this paper aims to improve data utilization by fully exploiting both labeled and unlabeled data. This paper discusses, in a progressively deepening manner, the following difficult problems of data-efficient deep learning: (1) When labeled and unlabeled data come from the same distribution, semi-supervised self-training, which generates pseudo labels for unlabeled data, has gradually become the mainstream approach. However, pseudo labels are not always reliable, and the accumulated errors of incorrect pseudo labels can easily mislead model training, giving rise to the challenge of confirmation bias. (2) Furthermore, pseudo labels are hard to define accurately in the continuous output space of regression problems, so in-domain semi-supervised regression faces the difficulty of how to perform self-training at all. (3) On the other hand, the independent and identically distributed assumption is often too strict for real-world scenarios. When labeled and unlabeled data are distributed across domains, existing methods generally achieve cross-domain distribution adaptation by designing loss functions, whereas the more general question of transferable neural network architectures has rarely been studied, leaving the intrinsic transferability of deep neural networks clearly insufficient. (4) Meanwhile, safety-critical application scenarios place great importance on the uncertainty calibration of deep models, yet calibrating cross-domain adaptive models faces the dual challenges of distribution shift and the absence of labeled data in the target domain.

Around these difficult problems, the research of this paper proceeds in three stages, with the following main innovations. In the first stage, this paper explores general learning algorithms for the setting where labeled and unlabeled data share the same distribution. For problem (1), it proposes a self-tuning method that performs group contrastive learning on unlabeled data according to their pseudo labels, reducing the negative impact of incorrect pseudo labels on model training; on this basis, for problem (2), focusing on the difficulty of defining pseudo labels in regression scenarios, it proposes a minimax model that improves the model's invariance to perturbations. In the second stage, this paper dissects the building blocks of deep neural networks and, based respectively on the low-order statistics of feature maps and the probabilistic outputs of a domain discriminator, designs a transferable batch normalization layer and a transferable attention layer, yielding transferable deep network architectures for problem (3) and improving the intrinsic transferability of deep neural networks. In the third stage, for the cross-domain uncertainty calibration problem of (4), this paper proposes, based on density-ratio theory, an unbiased estimation framework for the calibration error in the target domain; it further proves via the R\'{e}nyi divergence that the variance of this estimator is bounded, and proposes a serial control variate method to reduce the variance of the calibration error estimator.

In addition, this paper designs and implements EfficientDL, a data-efficient deep learning algorithm library that systematically organizes the above innovative methods. Finally, taking cross-production-line solder joint defect detection in electronics manufacturing as a typical case, this paper introduces how the proposed methods were translated into a real-world business platform.
With the help of large-scale labeled datasets, deep learning has developed rapidly and has been widely used in the fields of computer vision, natural language processing, and robotics. However, collecting enough labeled data in real-world applications is often prohibitively time-consuming and labor-intensive. To reduce the requirement for labeled data, the data-efficient deep learning proposed in this paper aims to improve data efficiency by fully exploiting both labeled and unlabeled data. This paper discusses the following problems and challenges of data-efficient deep learning: (1) When labeled and unlabeled data come from the same distribution, semi-supervised self-training, which generates pseudo labels for unlabeled data, has gradually become the mainstream approach. However, pseudo labels are not always reliable, and the accumulated errors of false pseudo labels can easily mislead model training, which brings the challenge of confirmation bias. (2) Furthermore, pseudo labels are difficult to define accurately in the continuous output space of regression problems, so semi-supervised regression within a single domain faces the challenge of how to perform self-training. (3) On the other hand, the independent and identically distributed assumption is often too strict in the real world. When labeled and unlabeled data are distributed across domains, existing methods generally realize distribution adaptation through the design of loss functions, whereas the more general question of transferable neural network architectures has received little attention, so the intrinsic transferability of deep neural networks remains clearly insufficient. (4) At the same time, safety-critical applications attach great importance to the uncertainty calibration of deep learning models, but the uncertainty calibration of domain adaptation models faces the dual challenges of distribution shift and the lack of target labels.

To address these challenges, the research in this paper proceeds in three stages, and the main contributions can be summarized as follows. In the first stage, this paper explores general learning algorithms for the setting where labeled and unlabeled data share the same distribution. For challenge (1), a self-tuning method that performs group contrastive learning on unlabeled data according to their pseudo labels is proposed to reduce the negative impact of false pseudo labels on model training; on this basis, for challenge (2), this paper proposes a minimax model that enhances the model's invariance to perturbations. In the second stage, to address challenge (3), this paper delves into the components of deep neural networks and designs a transferable attention layer and a transferable batch normalization layer based on the outputs of the domain discriminator and the low-order statistics of feature maps respectively, which effectively improves the intrinsic transferability of deep neural networks. In the third stage, to address the cross-domain uncertainty calibration problem of challenge (4), an unbiased estimation framework for the calibration error in the target domain is proposed based on density-ratio theory. Furthermore, this paper proves that the variance of this estimator is bounded using the R\'{e}nyi divergence, and proposes a serial control variate method to further reduce the variance of the estimated calibration error.

In addition, this paper designs and implements EfficientDL, a data-efficient deep learning algorithm library that systematizes the above innovative methods. Finally, taking cross-production-line solder joint defect detection in electronic manufacturing as a typical case, this paper introduces how the proposed methods were transferred to a real-world industrial platform.
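To make the self-training setting of challenge (1) concrete, the following minimal sketch shows confidence-thresholded pseudo-labeling, the standard self-training step whose confidently wrong labels cause the confirmation bias discussed above. The model, threshold, and batch variables are illustrative placeholders, not the implementation used in this paper.
\begin{verbatim}
import torch
import torch.nn.functional as F


def pseudo_label_step(model, labeled_batch, unlabeled_batch, threshold=0.95):
    """One self-training step: supervised loss on labeled data plus a
    pseudo-label loss on unlabeled samples predicted with high confidence."""
    x_l, y_l = labeled_batch
    x_u = unlabeled_batch

    # Supervised cross-entropy on the labeled mini-batch.
    loss_sup = F.cross_entropy(model(x_l), y_l)

    # Generate pseudo labels without tracking gradients.
    with torch.no_grad():
        probs_u = F.softmax(model(x_u), dim=1)
        conf, pseudo_y = probs_u.max(dim=1)
        mask = conf.ge(threshold).float()  # keep only confident predictions

    # Cross-entropy against the pseudo labels; confidently wrong labels are
    # exactly the source of the confirmation bias discussed above.
    loss_pl = (F.cross_entropy(model(x_u), pseudo_y, reduction="none") * mask).mean()

    return loss_sup + loss_pl
\end{verbatim}
The self-tuning method of the first stage mitigates this failure mode by replacing the hard per-sample pseudo-label loss with a group contrastive objective over samples sharing the same pseudo label, as detailed in the corresponding chapter.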
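The transferable normalization layer of the second stage builds on the observation that the low-order statistics of feature maps, namely per-channel means and variances, are domain-dependent. The sketch below only illustrates this general idea with a domain-specific batch normalization module that keeps separate statistics for the source and target domains; it is not the layer proposed in this paper, which goes beyond this simple variant.
\begin{verbatim}
import torch
import torch.nn as nn


class DomainSpecificBN2d(nn.Module):
    """Normalize source and target feature maps with their own running
    statistics (illustrative sketch, not the proposed transferable layer)."""

    def __init__(self, num_features):
        super().__init__()
        self.bn_source = nn.BatchNorm2d(num_features)
        self.bn_target = nn.BatchNorm2d(num_features)

    def forward(self, x, domain="source"):
        bn = self.bn_source if domain == "source" else self.bn_target
        return bn(x)


# Usage: each domain is whitened by its own low-order statistics.
layer = DomainSpecificBN2d(64)
source_out = layer(torch.randn(8, 64, 14, 14), domain="source")
target_out = layer(torch.randn(8, 64, 14, 14), domain="target")
\end{verbatim}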
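For the third stage, the density-ratio argument can be made explicit. Under the covariate-shift assumption, an importance-weighted estimate of the target-domain calibration error can be computed from labeled source validation samples alone; the notation below follows the textbook form of importance weighting rather than the exact formulation in this paper.
\[
  \widehat{\mathrm{CE}}_{\mathcal{T}}
  = \frac{1}{n}\sum_{i=1}^{n} w(x_i)\,
    \Big( \max_{k}\hat{p}_k(x_i) - \mathbb{1}\big[\hat{y}(x_i) = y_i\big] \Big),
  \qquad
  w(x) = \frac{p_{\mathcal{T}}(x)}{p_{\mathcal{S}}(x)},
\]
where $(x_i, y_i)$ are labeled validation samples drawn from the source distribution $p_{\mathcal{S}}$, $\hat{p}_k$ is the predicted probability of class $k$, and $\hat{y}$ is the predicted label. Since $p_{\mathcal{T}}(y \mid x) = p_{\mathcal{S}}(y \mid x)$ under covariate shift, $\mathbb{E}_{p_{\mathcal{S}}}[w(x) f(x, y)] = \mathbb{E}_{p_{\mathcal{T}}}[f(x, y)]$ for any integrand $f$, so the estimate is unbiased even though no target labels are available; its variance is driven by the second moment of $w$, which is what the R\'{e}nyi-divergence analysis bounds.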
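The serial control variate method then reduces the variance of this estimator. The classical single-control-variate identity it builds on is recalled below; the serial construction itself is specific to this paper and is not reproduced here.
\[
  \hat{\theta}_{\mathrm{cv}} = \hat{\theta} - \beta\,(\hat{\mu} - \mu),
  \qquad
  \beta^{\star} = \frac{\operatorname{Cov}(\hat{\theta}, \hat{\mu})}{\operatorname{Var}(\hat{\mu})},
  \qquad
  \operatorname{Var}\big(\hat{\theta}_{\mathrm{cv}}\big)
  = \big(1 - \rho^{2}\big)\operatorname{Var}\big(\hat{\theta}\big)
  \ \text{ at } \beta = \beta^{\star},
\]
where $\hat{\mu}$ is an auxiliary statistic with known expectation $\mu$ that is correlated with the estimator $\hat{\theta}$, and $\rho$ is their correlation coefficient. Because $\mathbb{E}[\hat{\mu} - \mu] = 0$, the correction preserves unbiasedness while shrinking the variance.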