
Transferability-Guided Cross-Domain Cross-Task Transfer Learning

Author: 谭杨
  • Student ID
    2019******
  • Degree
    Doctoral
  • Email
    tan******.cn
  • Defense Date
    2023.06.27
  • Supervisor
    LI YANG
  • Discipline
    Data Science and Information Technology
  • Pages
    106
  • Confidentiality Level
    Public
  • Department
    600 Tsinghua-Berkeley Shenzhen Institute
  • Keywords
    transfer learning, transferability estimation, cross-domain cross-task transfer, model finetuning, domain generalization

Abstract


In recent years, supervised deep learning has greatly advanced artificial intelligence (AI) and has been widely applied in areas such as computer vision, natural language processing, and AI-generated content. However, its success relies on large-scale labeled training data and on the assumption that training and testing data come from the same domain (data distribution). These conditions are not always satisfied in diverse real-world scenarios, so learning effectively from limited labeled data and overcoming the negative effects of domain shift is an urgent open problem. Transfer learning is one of the most effective learning paradigms for this problem: it reuses knowledge from a related source domain or task to support the learning of few-shot target tasks. However, efficiently and accurately finding the best transfer strategy, i.e., understanding when to transfer and what to transfer, remains the most fundamental and challenging problem in transfer learning. To address these challenges, this thesis makes three contributions.

First, we address the when-to-transfer problem in classification tasks from the perspective of transferability estimation. We propose an effective transferability estimation framework, OTCE (Optimal Transport based Conditional Entropy), which efficiently and accurately predicts the transfer performance of a source-target task pair in challenging cross-domain cross-task transfer scenarios, and thus serves as an indicator for selecting highly transferable source tasks. We also demonstrate its effectiveness in boosting the performance of multi-source feature fusion.

Second, we investigate transferability estimation for semantic segmentation, which is of great value in areas such as autonomous driving and medical image diagnosis. Accurately and efficiently estimating the transferability of semantic segmentation models remains an unsolved problem. To this end, we propose a flexible and universal adaptation method that extends existing transferability metrics designed for classification or regression tasks to semantic segmentation and produces accurate transferability estimates. Furthermore, we incorporate transferability scores as weighting coefficients into model training, so that low-transferability regions receive more attention, ultimately yielding higher segmentation accuracy.

Finally, we interpret the what-to-transfer problem as learning more generalizable feature representations. For two representative transfer learning tasks, model finetuning and domain generalization, we use transferability as guidance for adapting source features. In particular, we adopt the proposed OTCE metric as an optimization objective: the source model is optimized by maximizing the transferability score, so that the learned feature representation transfers more easily to the target task and ultimately achieves higher transfer accuracy.
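To make the OTCE idea above concrete, here is a minimal sketch, under simplifying assumptions of my own, of the two quantities the abstract refers to: an optimal-transport-based domain distance and the conditional entropy of target labels given source labels under the learned coupling. It uses the POT library; the helper name otce_components is introduced here for illustration only, and the thesis combines such quantities into the final OTCE score in its own way, which is omitted.

import numpy as np
import ot  # POT: Python Optimal Transport


def otce_components(Xs, Ys, Xt, Yt, reg=0.1):
    """Sketch of two OTCE ingredients for a source task (Xs, Ys) and a
    target task (Xt, Yt), where Xs/Xt are feature matrices extracted by
    the source model and Ys/Yt are integer class labels."""
    n, m = len(Xs), len(Xt)
    a = np.full(n, 1.0 / n)          # uniform weights over source samples
    b = np.full(m, 1.0 / m)          # uniform weights over target samples
    M = ot.dist(Xs, Xt)              # squared Euclidean cost matrix
    P = ot.sinkhorn(a, b, M, reg)    # entropic OT coupling (soft sample matching)

    wd = float(np.sum(P * M))        # domain difference: total transport cost

    # Task difference: joint label distribution induced by the coupling,
    # then H(Yt | Ys) = H(Ys, Yt) - H(Ys).
    joint = np.zeros((Ys.max() + 1, Yt.max() + 1))
    for i in range(n):
        for j in range(m):
            joint[Ys[i], Yt[j]] += P[i, j]
    joint /= joint.sum()
    p_ys = joint.sum(axis=1)
    eps = 1e-12
    cond_ent = -np.sum(joint * np.log(joint + eps)) + np.sum(p_ys * np.log(p_ys + eps))
    return wd, cond_ent

A lower transport cost and a lower conditional entropy both indicate an easier source-to-target transfer, which is why such scores can rank candidate source tasks without actually finetuning on each of them.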
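The abstract also mentions folding transferability scores into segmentation training as region weights. Below is a hypothetical illustration of that idea, not the thesis implementation: a per-pixel cross-entropy loss re-weighted by an externally estimated transferability map, with the names transferability_weighted_ce, trans_map, and alpha assumed here for illustration.

import torch
import torch.nn.functional as F


def transferability_weighted_ce(logits, target, trans_map, alpha=1.0):
    """logits: (N, C, H, W) raw class scores; target: (N, H, W) int64 labels;
    trans_map: (N, H, W) transferability estimates in [0, 1], where higher
    means easier to transfer. Low-transferability pixels get larger weights."""
    per_pixel = F.cross_entropy(logits, target, reduction="none")  # (N, H, W)
    weights = 1.0 + alpha * (1.0 - trans_map)  # upweight hard-to-transfer regions
    return (weights * per_pixel).sum() / weights.sum()

Normalizing by the total weight keeps the loss on a scale comparable to the unweighted cross-entropy, so the weighting changes where the model focuses rather than the overall learning rate.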