
高效稳健卷积神经网络架构的搜索方法研究

Neural Architecture Search for Efficient and Robust Convolutional Neural Networks

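Author: 宁雪妃 (Ning Xuefei)
  • Student ID
    2016******
  • Degree
    Doctorate
  • Email
    fox******com
  • Defense date
    2021.09.04
  • Advisor
    杨华中 (Yang Huazhong)
  • Discipline
    Electronic Science and Technology
  • Pages
    142
  • Confidentiality level
    Public
  • Department
    023 Department of Electronic Engineering
  • Chinese keywords
    卷积神经网络, 神经网络架构搜索, 容错性, 鲁棒性, 神经网络模型压缩
  • English keywords
    Convolutional Neural Network, Neural Architecture Search, Fault Tolerance, Adversarial Robustness, Neural Network Compression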

Abstract

Convolutional Neural Networks (CNNs) are widely used in vision tasks, and a CNN's architecture is critical to both its task performance and its efficiency. For a long time, researchers have improved the performance and efficiency of CNNs by designing new architectures by hand. Because this process relies entirely on human expertise, architectures have evolved slowly. Moreover, real-world applications span a large number of task scenarios and hardware platforms, so relying on experts to design, deploy, and tune a CNN for each one is costly and does not scale. More automated architecture design, namely Neural Architecture Search (NAS), has therefore become an active research topic.

This thesis studies architecture search methods for efficient and robust CNN systems. NAS faces two challenges. First, because the architecture search space is huge and evaluating each architecture is expensive, NAS algorithms consume enormous resources. Second, because a NAS system has many variable factors, it is difficult to verify the effectiveness of a NAS algorithm reliably without a unified, modular software implementation. To address these challenges in core algorithms and software frameworks, this thesis studies and improves the two core components of a NAS algorithm, the search strategy and the evaluation strategy. It then implements the proposed algorithms, together with other effective NAS methods, as modules in the aw_nas software framework. Building on these algorithms and this framework, the thesis applies architecture search to the robustness and efficiency of CNN systems, helping to build more efficient, robust, and fault-tolerant CNN systems from the architecture design perspective.

The main contributions of this thesis are summarized as follows:

* Core algorithm: This thesis designs methods for constructing and training architecture performance predictors, tailored to the data characteristics of CNN architectures and the problem characteristics of NAS. The proposed predictors significantly improve the sampling efficiency of the search strategy. In addition, this thesis proposes a framework for analyzing the evaluation quality of fast architecture evaluation strategies. Through extensive experiments, it gives a comprehensive picture of the current state of these strategies and offers suggestions for their future research and application. The thesis recommends that future work adopt this analysis framework as well.
* Software framework: This thesis develops a modular NAS software framework, aw_nas. With a unified framework, researchers can compare the effectiveness of different NAS algorithms in a more controlled and transparent way, and developers can apply NAS algorithms to their own application scenarios more conveniently.
* Applications to robust and efficient CNN systems: This thesis designs NAS workflows that improve, from the architecture design perspective, a CNN's fault tolerance to hardware computation errors and its robustness against adversarial attacks. For efficient CNN systems, it combines NAS with model compression (pruning and quantization) and designs NAS workflows that yield lightweight architectures with better task performance.
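
The abstract does not say which measures the evaluation quality analysis framework uses. As an illustration only, one common way to judge a fast evaluation strategy is to check how well the ranking it produces agrees with the ranking obtained after full training. The sketch below computes Kendall's tau (the tau-a variant, in which tied pairs count as neither concordant nor discordant) in plain Python. The function name `kendall_tau` and all numbers are hypothetical and are not taken from the thesis or from aw_nas.

```python
from itertools import combinations


def kendall_tau(proxy_scores, true_scores):
    """Kendall's tau-a between two equally long lists of scores.

    A pair of architectures is concordant when both score lists order it
    the same way, discordant when they disagree; ties count as neither.
    """
    if len(proxy_scores) != len(true_scores):
        raise ValueError("score lists must have the same length")
    n = len(proxy_scores)
    if n < 2:
        raise ValueError("need at least two architectures")

    concordant = 0
    discordant = 0
    for i, j in combinations(range(n), 2):
        agreement = (proxy_scores[i] - proxy_scores[j]) * (
            true_scores[i] - true_scores[j]
        )
        if agreement > 0:
            concordant += 1
        elif agreement < 0:
            discordant += 1

    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs


# Hypothetical scores for five candidate architectures (illustrative only).
proxy = [0.61, 0.58, 0.70, 0.66, 0.52]        # accuracy under a cheap evaluation
truth = [0.931, 0.925, 0.942, 0.928, 0.917]   # accuracy after full training

tau = kendall_tau(proxy, truth)
print(f"Kendall's tau = {tau:.2f}")
```

A value near 1 means the cheap evaluation ranks architectures almost exactly as full training does, a value near 0 means the ranking carries little information, and a negative value means it is misleading. A search strategy only needs the relative order of candidates, which is why a rank-based measure is a natural fit here.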