Image degradation arises from many sources, including motion blur, noise, defocus, and changes in complex environmental conditions. Image restoration aims to recover the detail lost in degraded images and, combined with object detection, can be widely applied in urban security, medical imaging, consumer electronics, and other fields. This dissertation studies image restoration and object detection at multiple spatial and temporal resolution scales. It investigates low-resolution image restoration to raise image resolution and enhance visual quality; it studies object detection in gigapixel ultra-high-resolution images to increase detection speed without reducing detection accuracy; and it studies video frame interpolation to raise temporal resolution, detecting human bodies to determine the regions of interest and removing motion blur from videos of people in motion, in support of human body restoration and point cloud completion. The main contributions of the dissertation are as follows.
First, to address the noise and blur caused by aberrations in low-resolution imaging modules and by low-cost optical lenses, this dissertation studies image restoration for under-optimized optical systems. Using the point spread function as prior knowledge, a deep projection network is designed to optimize the deconvolution parameters, which eliminates the ringing and artifacts that deconvolution algorithms are prone to and achieves aberration correction and deblurring. The proposed method restores low-resolution blurry images effectively and can run on smart-terminal vision processing platforms with limited computing resources.
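To make the role of the point spread function (PSF) prior concrete, the following is a minimal sketch of classical Wiener-style deconvolution with a known PSF. It is not the deep projection network of this dissertation: the Gaussian PSF, the function names, and the fixed regularization weight `reg` are all assumptions introduced for illustration. A learned method could be viewed as replacing such hand-set deconvolution parameters with optimized ones.

```python
import numpy as np

def gaussian_psf(size=9, sigma=1.5):
    """Hypothetical PSF for illustration: a normalized isotropic Gaussian."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    psf = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return psf / psf.sum()

def psf_to_otf(psf, shape):
    """Zero-pad the PSF to the image shape, shift its center to (0, 0), and take the FFT."""
    padded = np.zeros(shape)
    kh, kw = psf.shape
    padded[:kh, :kw] = psf
    padded = np.roll(padded, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.fft.fft2(padded)

def blur(image, psf):
    """Simulate degradation as a circular convolution of the image with the PSF."""
    otf = psf_to_otf(psf, image.shape)
    return np.real(np.fft.ifft2(np.fft.fft2(image) * otf))

def wiener_deconvolve(blurred, psf, reg=1e-3):
    """Regularized inverse filter; `reg` (an assumed constant) limits noise amplification."""
    otf = psf_to_otf(psf, blurred.shape)
    inverse_filter = np.conj(otf) / (np.abs(otf) ** 2 + reg)
    return np.real(np.fft.ifft2(np.fft.fft2(blurred) * inverse_filter))

# Small demonstration on a synthetic square with mild additive noise.
rng = np.random.default_rng(0)
sharp = np.zeros((64, 64))
sharp[20:44, 20:44] = 1.0
psf = gaussian_psf()
blurred = blur(sharp, psf) + rng.normal(0.0, 0.002, sharp.shape)
restored = wiener_deconvolve(blurred, psf, reg=1e-3)
print("MSE of blurred image: ", np.mean((blurred - sharp) ** 2))
print("MSE of restored image:", np.mean((restored - sharp) ** 2))
```

With a small `reg`, the inverse filter sharpens edges but tends to produce ringing near them; a larger `reg` reduces ringing at the cost of residual blur. This trade-off is the kind of behavior that parameter optimization in a deconvolution pipeline has to manage.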
Furthermore, for low-quality, low-resolution images, a dual-attention network based on multi-scale fusion is proposed for image super-resolution; it performs image denoising and image super-resolution simultaneously.
Second, to balance accuracy and speed in gigapixel-level object detection, a multi-scale density regression module is designed to generate density maps. Using only the cropped patches that have a high probability of containing objects overcomes the inherent inability of sliding windows to cope with large scale variation and reduces the computational cost. A heterogeneous-resolution detection network based on reverse segmentation is also proposed; its low-resolution reverse segmentation module and its polarity mask embedding and classification module strategically focus on important regions and exclude irrelevant ones. Simulation experiments on the public PANDA dataset show that the proposed detection algorithm outperforms currently published algorithms in both detection accuracy and detection speed.
Finally, at the temporal resolution scale, this dissertation studies human-detection-centered video frame interpolation for human body restoration, together with point cloud completion. A high-resolution video frame interpolation dataset covering everyday scenes is constructed to address the lack of human-centered high-resolution datasets and annotations. A human-centered video frame interpolation algorithm is designed that uses a keypoint-guided refinement network, a flow estimation network, and a masked-attention fusion network to restore the human body. The proposed method excels at capturing subtle poses and removing motion blur, and it restores human bodies with high clarity and accuracy in complex, dynamic scenes.
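The idea of using a density map to decide which patches of a gigapixel image are worth processing can be illustrated with a short sketch. The code below is only an assumed simplification, not the multi-scale density regression module itself: it takes a density map that is presumed to have been predicted on a downsampled image, sums the density in each cell of a fixed grid, and keeps only cells whose estimated object count reaches a threshold. The function name `select_patches` and the parameters `cell`, `scale`, and `min_count` are hypothetical, and the fixed grid does not attempt the scale adaptation described above.

```python
import numpy as np

def select_patches(density, cell, scale, min_count=0.5):
    """Return full-resolution crop boxes for grid cells with enough predicted objects.

    density   : 2D array, predicted object density on the downsampled image
    cell      : grid cell size in density-map pixels (assumed fixed here)
    scale     : ratio of full-resolution size to density-map size
    min_count : minimum summed density (estimated object count) to keep a cell
    """
    height, width = density.shape
    boxes = []
    for y in range(0, height, cell):
        for x in range(0, width, cell):
            count = float(density[y:y + cell, x:x + cell].sum())
            if count >= min_count:
                x1 = min(x + cell, width)
                y1 = min(y + cell, height)
                # Map the cell back to full-resolution pixel coordinates.
                boxes.append((x * scale, y * scale, x1 * scale, y1 * scale, count))
    # Process the most crowded regions first.
    boxes.sort(key=lambda box: box[4], reverse=True)
    return boxes

# Toy density map: one crowded corner and one nearly empty region.
density = np.zeros((8, 8))
density[0:2, 0:2] = 0.5
density[6, 6] = 0.1
for box in select_patches(density, cell=4, scale=100):
    print(box)
```

Only the selected boxes would then be cropped from the full-resolution image and passed to a detector, which is how such a scheme avoids running a sliding window over every location.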