Sparse Learning Based on Deep Neural Networks

Author: 吴凯伦 (Kailun Wu)
  • Student ID
    2014******
  • Degree
    Doctoral
  • Email
    why******com
  • Defense date
    2019.12.02
  • Advisor
    张长水 (Changshui Zhang)
  • Discipline
    Control Science and Engineering
  • Pages
    127
  • Confidentiality level
    Public
  • Department
    025 Department of Automation
  • Keywords
    Sparse Learning, Learned Iterative Shrinkage-Thresholding Algorithms, Convergence Analysis, Mutual Coherence, Network Compression

Abstract

Sparse learning, an important branch of machine learning, is used to solve application problems such as signal recovery, feature selection, compressed sensing, super-resolution reconstruction, image denoising, and image deblurring. Its core mathematical problem is to recover a high-dimensional sparse feature vector from a low-dimensional observation vector. Traditional sparse learning methods generally have two weaknesses: they place strict requirements on the mutual coherence between the columns of the dictionary matrix, and they converge slowly. With the rapid development and wide application of deep learning, many traditional machine learning problems have been recast through the idea of "deep unfolding": the fixed parameters of the iterative steps are made learnable and are trained end to end from supervised data. Typical representatives are the learned iterative shrinkage-thresholding algorithm (LISTA) and the learned iterative hard-thresholding algorithm (LIHT); a minimal sketch of the unfolded update is given after this abstract. These deep learning methods have not only proved extremely effective in experiments but have also been shown to hold clear theoretical advantages over traditional methods. Nevertheless, deep sparse learning methods still have shortcomings: for instance, LISTA outputs insufficient amplitudes compared with the ground truth, most deep sparse learning methods cannot handle signals corrupted by signal-dependent noise, and deep models still lack a deeper understanding of the application problems behind sparse learning. Beyond learning sparse features, deep neural networks can also learn sparse weights, which benefits network compression and miniaturization. This thesis focuses on four problems:

  • To address the insufficient amplitudes produced by LISTA, gain gates and overshoot gates with theoretical guarantees are proposed on top of the learned iterative thresholding framework. The original, incorrect "no false positives" assumption is corrected to a more appropriate "partial false positives" assumption, from which a tighter convergence bound is derived.
  • For the classical iterative shrinkage-thresholding algorithm, effective schemes exist for adapting the threshold. To further improve the performance of LISTA and make it applicable to sparse learning under different noise models, LISTA variants with reweighted thresholds and with input-controlled thresholds are proposed for sparse learning under Gaussian and Poisson noise.
  • For denoising sparse-feature signals without supervision on the sparse features, the thesis combines generative adversarial networks: the distance-norm loss function is reconstructed, and the discriminator is used to help correct the denoised samples, which improves the generalization ability of the denoising network.
  • The thesis applies sparse decomposition to the weight learning of deep neural networks (sketched after this abstract) and proves theoretically that the effect of regularization under sparse matrix factorization approximates that of adding an $\ell_2$-norm regularizer to the original parameter matrix; the degree of approximation depends on the third-order moment of the prior distribution corresponding to the regularizer and on the shared dimension of the factors. This decomposition framework is successfully applied to several types of neural networks, including multi-layer perceptrons (MLP), convolutional neural networks (CNN), and recurrent neural networks (RNN).
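
To make the deep-unfolding idea concrete, recall the standard sparse-recovery (LASSO) formulation and the ISTA iteration that LISTA unfolds. This is the textbook form, not the thesis's exact notation:

$$
\min_{x}\; \tfrac{1}{2}\,\lVert y - Ax\rVert_2^2 + \lambda\,\lVert x\rVert_1,
\qquad
x^{(k+1)} = \operatorname{soft}_{\lambda/L}\!\left(x^{(k)} - \tfrac{1}{L}\,A^{\top}\!\left(Ax^{(k)} - y\right)\right),
$$

where $\operatorname{soft}_{\theta}(z) = \operatorname{sign}(z)\max(\lvert z\rvert - \theta, 0)$ and $L$ is a Lipschitz constant of the gradient of the data term. LISTA truncates this iteration to a fixed number of layers and makes the matrices and thresholds learnable. Below is a minimal sketch, assuming PyTorch; the class and parameter names are illustrative, and the gain and overshoot gates proposed in the thesis would further modify this basic update:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LISTALayer(nn.Module):
    """One unfolded iteration: x <- soft(y @ W_e^T + x @ S^T, theta)."""
    def __init__(self, m, n):
        super().__init__()
        self.W_e = nn.Parameter(0.01 * torch.randn(n, m))  # plays the role of A^T / L
        self.S = nn.Parameter(torch.eye(n))                # plays the role of I - A^T A / L
        self.theta = nn.Parameter(torch.full((n,), 0.1))   # learned per-coordinate threshold

    def forward(self, y, x):
        pre = y @ self.W_e.T + x @ self.S.T
        # soft thresholding: sign(z) * max(|z| - theta, 0)
        return torch.sign(pre) * F.relu(pre.abs() - self.theta)

class LISTA(nn.Module):
    """A stack of unfolded iterations trained end to end on (y, x*) pairs."""
    def __init__(self, m, n, n_layers=16):
        super().__init__()
        self.n = n
        self.layers = nn.ModuleList(LISTALayer(m, n) for _ in range(n_layers))

    def forward(self, y):
        x = y.new_zeros(y.shape[0], self.n)  # start from x = 0
        for layer in self.layers:
            x = layer(y, x)
        return x

# e.g. model = LISTA(m=64, n=256); loss = F.mse_loss(model(y_batch), x_true_batch)
```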

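The weight-factorization contribution can likewise be illustrated with a generic sparsely regularized factorized layer. This is a rough sketch under assumed names (`FactorizedLinear`, `factor_penalty`) with a plain $\ell_1$ penalty on the factors; it is not the thesis's exact regularizer or theory:

```python
import torch
import torch.nn as nn

class FactorizedLinear(nn.Module):
    """Dense weight W (out x in) replaced by the product U @ V with shared dimension k."""
    def __init__(self, in_features, out_features, k):
        super().__init__()
        self.U = nn.Parameter(0.01 * torch.randn(out_features, k))
        self.V = nn.Parameter(0.01 * torch.randn(k, in_features))
        self.bias = nn.Parameter(torch.zeros(out_features))

    def forward(self, x):
        return x @ (self.U @ self.V).T + self.bias

    def factor_penalty(self):
        # sparsity-inducing l1 penalty on the factors; per the abstract, the
        # effect of such a regularizer approximates an l2 penalty on U @ V
        return self.U.abs().sum() + self.V.abs().sum()

# usage: total_loss = task_loss + lam * sum(
#     m.factor_penalty() for m in model.modules() if isinstance(m, FactorizedLinear))
```

With shared dimension $k$, storing $U$ and $V$ costs $k \cdot (\text{in} + \text{out})$ parameters instead of $\text{in} \cdot \text{out}$, which is where the compression benefit comes from when $k$ is small and the factors are sparse.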