Object detection and recognition in optical remote sensing images aims to locate objects in an image and determine their types. It is an active and challenging research topic in the field of remote sensing image processing and analysis. In this paper, we study detection and recognition methods for aircraft and inshore ship objects, focusing on three aspects: candidate object proposal, detection feature learning, and recognition feature learning. The main achievements are as follows.

First, a coarse-to-fine object detection method is proposed. We design a coarse-to-fine detection framework and a classification convolutional neural network, which together propose candidate objects and detect objects in two steps (sketched in code below). In the first step, a large-scale sliding window crops regions from the large-format image, and the classification network classifies them to obtain candidate regions that may contain aircraft or inshore ships; a small-scale sliding window then crops candidate objects from these candidate regions. In the second step, the classification network classifies the candidate objects to produce the final detections. To locate inshore ships accurately with inclined bounding boxes, we also propose an inclined bounding box generation method based on main direction detection (also sketched below). Experiments demonstrate that our method proposes over 50% fewer candidate objects than contemporaneous methods and exceeds state-of-the-art methods in detection accuracy.

Then, an object detection method based on deep feature combination is proposed. Inspired by object detection in natural images, we study a detection method that requires no explicit candidate object proposal. A high-performance convolutional neural network serves as the backbone of the detection network, which takes candidate regions as input and detects objects directly through bounding boxes defined on the feature maps. To improve detection accuracy for objects of different sizes, the features of multiple convolutional layers are combined to predict object positions and sizes (one plausible head arrangement is sketched below). Compared with our coarse-to-fine method, this method further improves detection accuracy and surpasses state-of-the-art methods.

Finally, a simple-to-complex object recognition method is proposed. Simulating the recognition process of human image analysts, we design a simple-to-complex framework in which a convolutional neural network with a relatively simple structure performs the first recognition step and one with a relatively complex structure performs the second (a sketch of the two-step logic is given below). In the first step, preliminary recognition is performed using the fully-connected features of the simple network and a softmax classifier. In the second step, recognition results with low confidence are re-examined: the fully-connected features and encoded convolutional features of the complex network are used to make the final recognition. Our method surpasses state-of-the-art methods in recognition accuracy; moreover, it extends to optical remote sensing scene classification, where its accuracy is comparable to state-of-the-art methods.
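To make the coarse-to-fine pipeline concrete, the following minimal sketch implements the two-step sliding-window extraction and classification described in the first contribution. It assumes the image is a numpy-style array; `region_clf` and `object_clf` are hypothetical stand-ins for the classification network, and the window sizes, strides, and score thresholds are illustrative placeholders rather than the settings used in the thesis.

```python
def sliding_windows(h, w, win, stride):
    """Yield top-left corners of win x win windows over an h x w image."""
    for y in range(0, max(h - win, 0) + 1, stride):
        for x in range(0, max(w - win, 0) + 1, stride):
            yield y, x

def coarse_to_fine_detect(image, region_clf, object_clf,
                          region_win=512, region_stride=256,
                          object_win=64, object_stride=32,
                          region_thr=0.5, object_thr=0.5):
    """Two-step detection: coarse region filtering, then fine classification.

    region_clf / object_clf: callables mapping an image patch to the
    probability that it contains a target (stand-ins for the CNN).
    """
    h, w = image.shape[:2]
    detections = []
    # Step 1 (coarse): keep only large windows flagged as candidate regions.
    for ry, rx in sliding_windows(h, w, region_win, region_stride):
        region = image[ry:ry + region_win, rx:rx + region_win]
        if region_clf(region) < region_thr:
            continue  # region unlikely to contain any target; skip its windows
        # Step 2 (fine): crop small candidate windows inside the kept region
        # and classify each one to obtain detections.
        rh, rw = region.shape[:2]
        for oy, ox in sliding_windows(rh, rw, object_win, object_stride):
            patch = region[oy:oy + object_win, ox:ox + object_win]
            score = object_clf(patch)
            if score >= object_thr:
                detections.append((ry + oy, rx + ox, object_win, score))
    return detections
```

Gating the fine, small-stride scan on the coarse classifier is what reduces the number of candidate objects: small windows inside rejected regions are never generated at all.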
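The abstract does not specify how the main direction of a ship is detected, so the sketch below makes an assumption: it estimates the main direction as the first principal axis (PCA) of the foreground pixels of a binary ship mask, then fits the tight rectangle aligned with that axis to obtain an inclined bounding box.

```python
import numpy as np

def inclined_box_from_mask(mask):
    """Fit an inclined bounding box to a 2-D binary foreground mask.

    The main direction is taken as the first principal axis of the
    foreground pixel coordinates (PCA is an assumed detector, not
    necessarily the one used in the thesis).
    Returns (center_xy, (length, width), angle_radians).
    """
    ys, xs = np.nonzero(mask)
    pts = np.stack([xs, ys], axis=1).astype(float)
    mean = pts.mean(axis=0)
    centered = pts - mean
    # Eigenvector of the largest eigenvalue = dominant pixel direction.
    eigvals, eigvecs = np.linalg.eigh(np.cov(centered.T))
    main_axis = eigvecs[:, np.argmax(eigvals)]
    angle = np.arctan2(main_axis[1], main_axis[0])
    # Express pixels in the frame aligned with the main direction.
    c, s = np.cos(angle), np.sin(angle)
    rot = centered @ np.array([[c, -s], [s, c]])
    x_min, x_max = rot[:, 0].min(), rot[:, 0].max()
    y_min, y_max = rot[:, 1].min(), rot[:, 1].max()
    # Rotate the box center back to image coordinates; the tight extents
    # along the two axes give the box length and width.
    center = mean + np.array([(x_min + x_max) / 2,
                              (y_min + y_max) / 2]) @ np.array([[c, s], [-s, c]])
    return center, (x_max - x_min, y_max - y_min), angle
```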
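The deep-feature-combination detector predicts object positions and sizes from several convolutional layers according to object size, which is in the spirit of SSD-style multi-scale prediction heads. The PyTorch sketch below shows one plausible arrangement; the backbone channel counts, the number of layers tapped, and the anchors per location are assumptions, not the thesis's configuration.

```python
import torch
import torch.nn as nn

class MultiLayerDetectionHead(nn.Module):
    """Box/score prediction combined from several backbone feature maps.

    Each tapped convolutional layer gets its own small head, so
    high-resolution (early) maps can handle small objects and
    low-resolution (late) maps large ones; all predictions are then
    concatenated. Channel counts and anchors per location are placeholders.
    """
    def __init__(self, in_channels=(256, 512, 1024), num_classes=2, anchors=4):
        super().__init__()
        self.num_classes = num_classes
        self.cls_heads = nn.ModuleList(
            nn.Conv2d(c, anchors * num_classes, 3, padding=1) for c in in_channels)
        self.box_heads = nn.ModuleList(
            nn.Conv2d(c, anchors * 4, 3, padding=1) for c in in_channels)

    def forward(self, feature_maps):
        scores, boxes = [], []
        for feat, cls_head, box_head in zip(feature_maps, self.cls_heads, self.box_heads):
            n = feat.shape[0]
            # (N, A*K, H, W) -> (N, H*W*A, K): class scores per default box.
            scores.append(cls_head(feat).permute(0, 2, 3, 1).reshape(n, -1, self.num_classes))
            # (N, A*4, H, W) -> (N, H*W*A, 4): box offsets per default box.
            boxes.append(box_head(feat).permute(0, 2, 3, 1).reshape(n, -1, 4))
        # Combine predictions from all layers into one candidate set.
        return torch.cat(scores, dim=1), torch.cat(boxes, dim=1)

# Example: three feature maps from a hypothetical backbone.
head = MultiLayerDetectionHead()
feats = [torch.randn(1, 256, 64, 64),
         torch.randn(1, 512, 32, 32),
         torch.randn(1, 1024, 16, 16)]
scores, boxes = head(feats)  # scores: (1, 21504, 2), boxes: (1, 21504, 4)
```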
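The simple-to-complex recognition logic reduces to confidence-gated escalation: accept the simple network's softmax output when it is confident, and route the remaining cases to the complex network. A minimal sketch, assuming hypothetical classifiers that return class-probability vectors and an illustrative confidence threshold (the thesis's exact criterion and value are not given in the abstract):

```python
import numpy as np

def two_step_recognize(patches, simple_net, complex_net, conf_thr=0.9):
    """Simple-to-complex recognition: escalate only low-confidence cases.

    simple_net / complex_net are hypothetical callables mapping an image
    patch to a class-probability vector; conf_thr is an illustrative
    threshold, not the thesis's value.
    """
    labels = []
    for patch in patches:
        probs = simple_net(patch)        # step 1: simple CNN + softmax
        if probs.max() < conf_thr:       # low confidence: take a closer look
            probs = complex_net(patch)   # step 2: complex CNN features
        labels.append(int(np.argmax(probs)))
    return labels
```

Because most patches are resolved in the first step, the more expensive complex network runs only on the ambiguous minority, mirroring how an image analyst takes a second, closer look only at uncertain targets.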