Visual Simultaneous Localization and Mapping (VSLAM) estimates a robot's pose and reconstructs the surrounding scene from visual sensor data in unknown environments. As a navigation approach that does not depend on GPS, it is currently a research hotspot in the field of autonomous unmanned systems. This thesis studies monocular VSLAM methods that fuse data from additional sensors to address shortcomings of monocular VSLAM such as scale ambiguity, and validates the proposed methods experimentally on real systems. The main contributions are as follows:

1) A monocular VSLAM method for indoor planar mobile robots is proposed. The method fuses monocular images with wheel-odometry data in an optimization-based six-degree-of-freedom SLAM framework; by suitably setting the Jacobian matrix and using the Levenberg method, it constrains the estimated robot poses to the ground plane (a sketch of this constraint is given after the abstract). In addition, the method creates and merges sub-maps to cope with visual tracking failures in complicated environments. Experiments show that the method localizes indoor planar mobile robots accurately and keeps the constructed map globally consistent even in complicated environments.

2) An optimization-based, tightly coupled visual-inertial navigation algorithm that fuses monocular images with inertial measurement unit (IMU) data is implemented in software, and the code is released as open source. Tests on a public dataset show that the implementation reaches a localization accuracy of about 10 cm in scenes of roughly 30 m² and 300 m². Building on this algorithm, an improved local-map optimization method based on inverse-depth parameterization of landmarks is proposed to improve numerical stability (see the second sketch below). Outdoor experiments show that the improved method is better suited to large-scale outdoor scenes while achieving similar or even higher localization accuracy.

3) A visual-inertial navigation system combining a monocular camera and an IMU is designed and built from commercial off-the-shelf sensors. The camera and IMU data are temporally synchronized, meeting the time-synchronization requirements of visual-inertial navigation algorithms (see the third sketch below). Multiple experiments were conducted, including indoor and outdoor handheld scenarios and outdoor UAV flights. The localization error when returning to the starting point is within 10 m after a handheld walk of about 550 m around a building, and the three-axis position errors are within 1 m in a 20 m × 20 m × 20 m UAV flight scene, which demonstrates the localization performance of the integrated hardware-software system.
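To make the planar constraint in contribution 1) concrete, the following is a minimal, hypothetical sketch (not the thesis implementation) of one common way to realize it: the Jacobian columns for the out-of-plane degrees of freedom (z translation, roll, pitch) are zeroed before a Levenberg-damped normal-equation step, so the computed increment leaves those components unchanged. The pose parameterization [x, y, z, roll, pitch, yaw], the function names, and the numerical Jacobian are all assumptions made for illustration.

```python
import numpy as np

# Pose parameterized as [x, y, z, roll, pitch, yaw]; a planar robot may
# only change x, y, yaw. Indices of the constrained (fixed) DoF:
PLANAR_FIXED = [2, 3, 4]  # z, roll, pitch

def planar_levenberg_step(residual_fn, pose, lam=1e-3, eps=1e-6):
    """One damped Gauss-Newton (Levenberg) step with the out-of-plane
    Jacobian columns zeroed, so the update stays in the ground plane."""
    r = residual_fn(pose)                      # residual vector, shape (m,)
    # Numerical Jacobian of the residuals w.r.t. the 6 pose parameters.
    J = np.zeros((r.size, 6))
    for i in range(6):
        dp = np.zeros(6)
        dp[i] = eps
        J[:, i] = (residual_fn(pose + dp) - r) / eps
    J[:, PLANAR_FIXED] = 0.0                   # planar constraint
    # Damped normal equations: (J^T J + lam * I) dx = -J^T r
    H = J.T @ J + lam * np.eye(6)
    dx = np.linalg.solve(H, -J.T @ r)
    return pose + dx                           # dx is exactly zero on fixed DoF

# Example: drive a pose toward a planar target; z/roll/pitch never move.
target = np.array([1.0, 2.0, 0.0, 0.0, 0.0, 0.5])
pose = np.zeros(6)
for _ in range(20):
    pose = planar_levenberg_step(lambda p: p - target, pose)
```

With this column masking, the damping term alone fills the diagonal entries of the fixed degrees of freedom, so the linear system stays solvable and the corresponding increments are exactly zero.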
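The inverse-depth parameterization mentioned in contribution 2) can be illustrated with the conversion below. This is the generic textbook form of the technique, sketched under assumed conventions (a pinhole camera, an anchor frame per landmark, and a camera-to-world anchor pose R_wc, t_wc); the thesis applies it inside local-map optimization, which is not reproduced here.

```python
import numpy as np

def to_inverse_depth(p_anchor):
    """Parameterize a landmark given in its anchor camera frame as
    [u, v, rho]: normalized image coordinates plus inverse depth.
    Distant points map to rho near 0 instead of a huge depth value,
    which keeps the optimization numerically better conditioned."""
    x, y, z = p_anchor
    return np.array([x / z, y / z, 1.0 / z])

def from_inverse_depth(u_v_rho, R_wc, t_wc):
    """Recover the landmark in world coordinates from its inverse-depth
    parameters and the anchor camera pose (R_wc, t_wc: camera-to-world)."""
    u, v, rho = u_v_rho
    p_anchor = np.array([u, v, 1.0]) / rho     # back-project in anchor frame
    return R_wc @ p_anchor + t_wc
```

The benefit is conditioning: a landmark near infinity has a well-behaved parameter rho ≈ 0, whereas its Euclidean depth z = 1/rho would be huge and badly scaled relative to the other state variables.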
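Contribution 3) solves camera-IMU time synchronization in hardware; on the software side, a typical consumer of such synchronized streams gathers the IMU samples that fall between consecutive camera frames, optionally correcting a small residual clock offset. The sketch below shows only this bookkeeping step under an assumed data layout (a sorted timestamp list plus a parallel sample list); the function name and the time_offset handling are illustrative, not the thesis design.

```python
import bisect

def imu_between_frames(imu_stamps, imu_data, t_prev, t_curr, time_offset=0.0):
    """Collect the IMU samples whose timestamps fall between two
    consecutive camera frames. Camera times are mapped into the IMU
    clock via a constant time_offset modeling residual clock skew."""
    lo = bisect.bisect_left(imu_stamps, t_prev + time_offset)
    hi = bisect.bisect_right(imu_stamps, t_curr + time_offset)
    return imu_data[lo:hi]
```

In a hardware-triggered setup such as the one described in the thesis, time_offset would ideally be zero; the parameter is kept only to show where a calibrated constant offset would enter.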