

Depth Map Super-Resolution and Matting Based on High-Resolution Color Image Guidance

Author: Cui Yingjie
  • Student ID
    2019******
  • Degree
    Master's
  • Email
    815******com
  • Defense date
    2022.05.19
  • Supervisor
    Liao Qingmin
  • Discipline
    Electronics and Communication Engineering
  • Pages
    76
  • Classification level
    Public
  • Department
    023 Department of Electronic Engineering
  • Keywords
    RGB guided super-resolution, Depth map, Portrait matting, Feature fusion, Multiscale feature

Abstract

RGB guided super-resolution is an image-processing technique applicable to a variety of scenarios. It uses the edge detail provided by a high-resolution color image to refine the edges of a low-resolution image, thereby raising its effective resolution. This thesis applies RGB guided super-resolution to the low-resolution images arising in two tasks, depth map super-resolution and portrait matting, to produce the corresponding high-quality, high-resolution outputs and increase their practical value.

Whether for binocular cameras based on stereo matching or RGB-D cameras based on structured light or time of flight, the captured depth maps are of relatively low resolution once cost and size are constrained. The Kinect v2, for example, can only produce 512×424 depth maps, which suffer from lost edge information and local distortion. Fortunately, an RGB-D camera captures a corresponding high-resolution color image alongside each low-resolution depth map, so color guidance can be exploited for depth map super-resolution (DSR). This thesis therefore proposes an RGB guided DSR algorithm based on a coupled U-Net (CU-Net): two coupled U-Nets form a depth branch and a color branch to extract deeper multi-scale information. CU-Net links the two branches with a dual skip connection (DSC) structure and, at the end of the network, a multi-scale feature reconstruction (MSFR) module feeds the fused features of every level into the final reconstruction. Experiments show that CU-Net outperforms the methods used for comparison.

In portrait matting, the user's input is usually a high-resolution video. If the network is fed a single frame at a time, it cannot draw on neighboring frames as a reference and its results lack temporal continuity; feeding multiple high-resolution frames at once, however, incurs a huge computational cost. This thesis therefore designs an auxiliary-input-free portrait video matting algorithm based on color image guidance and an attention mechanism (Attention and Guided RVM, AGRVM). AGRVM first downsamples the high-resolution frames and trains a coarse matting network on the resulting low-resolution consecutive frames; an RGB guided matting super-resolution (RGBG-MSR) module then upsamples the coarse low-resolution mattes under color guidance, balancing temporal continuity, quality, and speed.
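The core idea of color-guided super-resolution, letting depth edges snap to the edges of a high-resolution color image, can be illustrated with classic joint bilateral upsampling. This is a minimal numpy sketch of that general principle, not the thesis's CU-Net; the function name and the kernel parameters (`sigma_s`, `sigma_r`, `radius`) are illustrative choices.

```python
import numpy as np

def joint_bilateral_upsample(depth_lr, guide_hr, scale, sigma_s=1.0, sigma_r=0.1, radius=2):
    """Upsample a low-res depth map using a high-res guide image.

    Each high-res pixel averages nearby low-res depth samples, weighted by
    (a) spatial distance in low-res coordinates and (b) similarity between
    the guide values at the target pixel and at each sample position, so
    depth edges align with guide-image edges.
    """
    H, W = guide_hr.shape[:2]
    h, w = depth_lr.shape
    out = np.zeros((H, W))
    for y in range(H):
        for x in range(W):
            ly, lx = y / scale, x / scale            # low-res coordinate of this pixel
            cy, cx = int(round(ly)), int(round(lx))
            num = den = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    sy, sx = cy + dy, cx + dx
                    if not (0 <= sy < h and 0 <= sx < w):
                        continue
                    # spatial (domain) weight
                    ws = np.exp(-((sy - ly) ** 2 + (sx - lx) ** 2) / (2 * sigma_s ** 2))
                    # range weight from the guide image at the sample's hi-res position
                    gy = min(int(sy * scale), H - 1)
                    gx = min(int(sx * scale), W - 1)
                    diff = guide_hr[y, x] - guide_hr[gy, gx]
                    wr = np.exp(-np.sum(diff ** 2) / (2 * sigma_r ** 2))
                    weight = ws * wr
                    num += weight * depth_lr[sy, sx]
                    den += weight
            out[y, x] = num / den if den > 0 else depth_lr[cy, cx]
    return out
```

With a step edge in the guide, depth samples on the wrong side of the edge receive negligible range weight, so the upsampled depth edge stays sharp instead of being blurred as plain bicubic interpolation would do.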
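The three-stage AGRVM pipeline described above (downsample, recurrent coarse matting on low-resolution frames, color-guided matting super-resolution) can be sketched structurally as follows. The coarse network and the RGBG-MSR module are stand-ins here, a simple threshold with temporal blending and nearest-neighbour upsampling, so this shows only the data flow, not the thesis's actual networks.

```python
import numpy as np

def downsample(img, s):
    """Box-filter downsampling by an integer factor s."""
    H, W = img.shape
    return img.reshape(H // s, s, W // s, s).mean(axis=(1, 3))

def coarse_matting(frame_lr, state):
    """Placeholder for the low-resolution recurrent matting network:
    thresholds intensity and blends with the previous frame's alpha
    to mimic temporal consistency via a recurrent state."""
    alpha = (frame_lr > 0.5).astype(float)
    if state is not None:
        alpha = 0.5 * alpha + 0.5 * state
    return alpha, alpha  # (alpha matte, next recurrent state)

def guided_upsample(alpha_lr, frame_hr, s):
    """Placeholder for the RGBG-MSR module: nearest-neighbour upsampling
    (a real module would refine edges using the high-res color frame)."""
    return np.repeat(np.repeat(alpha_lr, s, axis=0), s, axis=1)

def agrvm_pipeline(frames_hr, s=2):
    alphas, state = [], None
    for f in frames_hr:
        f_lr = downsample(f, s)                      # cut compute for the heavy network
        a_lr, state = coarse_matting(f_lr, state)    # temporal info flows through state
        alphas.append(guided_upsample(a_lr, f, s))   # restore full resolution
    return alphas
```

The point of the structure is that the expensive per-frame network only ever sees low-resolution frames, while the guided super-resolution step restores the output resolution using the original high-resolution frame as guidance.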