Research on Multi-modal Collaborative Perception Methods for Autonomous Driving under a Unified Bird's-Eye View

Author: 陈潜
  • Student ID
    2020******
  • Degree
    Master's
  • Email
    248******com
  • Defense date
    2024.05.22
  • Supervisor
    张毅
  • Discipline
    Engineering Management
  • Pages
    85
  • Confidentiality
    Public
  • Department
    025 Department of Automation
  • Keywords
    Bird's-Eye View; Multimodal Sensor Feature Fusion; Multi-Vehicle Cooperative Perception; Beyond-Visual-Range Perception; Autonomous Driving Technology

Abstract

The perception capability of an autonomous driving system, much like the eyes of the vehicle, must stably and accurately detect environmental information in complex real-world scenes. Constrained by the detection range of available sensors, perception systems that rely mainly on single-vehicle intelligence face significant bottlenecks in effective sensing range and accuracy, especially in blind-spot occlusion and beyond-visual-range scenarios. Introducing collaborative perception based on vehicle-to-vehicle connectivity to improve driving safety in such high-risk scenarios therefore carries important technical and societal significance. Starting from this point, this thesis focuses on perception optimization for blind-spot occlusion scenarios, studies core technologies such as multi-modal perception in a bird's-eye view (BEV) and multi-vehicle collaborative perception, and identifies the remaining problems and room for improvement in current research.

First, to address the physical limits on the detection range of any single sensor modality, a new fusion module is designed that fuses multi-modal sensor features in a unified BEV space, greatly reducing information loss and fusion complexity. A modality-failure guidance map is added to this module to provide dynamic fusion weights, overcoming the inability of conventional static averaging to give the most reliable sensor a higher weight under different operating conditions; the resulting fusion performs well both when a single modality fails and under normal conditions.

Second, for short-term occlusions in which an obstacle enters a blind spot, an inherited temporal feature fusion scheme is introduced and combined with a query-based temporal detection head. This not only improves detection in locally and briefly occluded scenes, but also smooths detected attributes such as category, size, and orientation.

Third, for long-term occlusion and beyond-visual-range problems, a multi-vehicle collaborative perception method is adopted. A spatio-temporal fusion method for multi-vehicle features in the unified BEV space is proposed, which avoids the information loss of conventional multi-vehicle collaborative perception while still meeting current communication bandwidth and latency constraints.

Finally, to validate the effectiveness of the whole study, the above methods are deployed on a real autonomous vehicle and evaluated in blind-spot detection experiments on a closed campus. The results show clear improvements in detecting occluded obstacles for both the single-vehicle multi-modal temporal fusion method and the multi-vehicle collaborative perception method. By grounding the research in real autonomous driving scenarios and validating it with deployed experiments and data, this thesis offers new ideas and methods for extending the effective detection range of autonomous driving perception systems.
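The abstract describes the multi-modal BEV fusion module and its modality-failure guidance map only at a high level. The following minimal PyTorch sketch shows one way such dynamic-weight fusion could look: per-modality validity masks steer per-cell softmax weights over camera and LiDAR BEV features. All class names, channel sizes, and the mask handling are illustrative assumptions, not the thesis implementation.

```python
# A minimal sketch (not the thesis code) of dynamic-weight fusion of camera and
# LiDAR features in a shared BEV grid. Module names, channel sizes, and the
# "failure guidance" mask heuristic are illustrative assumptions.
import torch
import torch.nn as nn


class DynamicBEVFusion(nn.Module):
    """Fuses per-modality BEV feature maps with per-cell learned weights,
    instead of a fixed average, so a degraded modality can be down-weighted."""

    def __init__(self, channels: int = 128, num_modalities: int = 2):
        super().__init__()
        # Small conv head mapping the concatenated BEV features (plus a
        # per-modality validity mask) to one weight logit per modality.
        self.weight_head = nn.Sequential(
            nn.Conv2d(num_modalities * (channels + 1), 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, num_modalities, kernel_size=1),
        )

    def forward(self, bev_feats: list, valid_masks: list) -> torch.Tensor:
        # bev_feats: list of (B, C, H, W) BEV features, one per modality.
        # valid_masks: list of (B, 1, H, W) maps, near zero where a modality is
        # unreliable (e.g. blinded camera, sparse LiDAR) -- the "failure guidance".
        guided = [torch.cat([f, m], dim=1) for f, m in zip(bev_feats, valid_masks)]
        logits = self.weight_head(torch.cat(guided, dim=1))    # (B, M, H, W)
        weights = torch.softmax(logits, dim=1)                  # per-cell weights
        stacked = torch.stack(bev_feats, dim=1)                 # (B, M, C, H, W)
        return (weights.unsqueeze(2) * stacked).sum(dim=1)      # (B, C, H, W)


if __name__ == "__main__":
    fusion = DynamicBEVFusion(channels=128, num_modalities=2)
    cam = torch.randn(1, 128, 200, 200)
    lidar = torch.randn(1, 128, 200, 200)
    masks = [torch.ones(1, 1, 200, 200), torch.ones(1, 1, 200, 200)]
    print(fusion([cam, lidar], masks).shape)  # torch.Size([1, 128, 200, 200])
```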
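Likewise, a minimal sketch of "inherited" temporal BEV fusion in the spirit of the abstract: the previous frame's BEV features are warped into the current ego frame using the known ego motion and blended in through a learned gate, which can help carry an object through a short occlusion. Grid size, channels, and the gating design are assumptions; the query-based detection head is omitted.

```python
# A minimal sketch of recurrent ("inherited") temporal BEV feature fusion.
# The gating design and ego-motion handling are assumptions, not the thesis code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TemporalBEVFusion(nn.Module):
    def __init__(self, channels: int = 128):
        super().__init__()
        # Gate decides, per BEV cell, how much history to keep vs. overwrite.
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1),
            nn.Sigmoid(),
        )

    def warp(self, prev_bev: torch.Tensor, ego_motion: torch.Tensor) -> torch.Tensor:
        # ego_motion: (B, 2, 3) affine transform from the previous BEV grid
        # into the current one, derived from odometry in a real system.
        grid = F.affine_grid(ego_motion, prev_bev.shape, align_corners=False)
        return F.grid_sample(prev_bev, grid, align_corners=False)

    def forward(self, cur_bev, prev_bev, ego_motion):
        prev_aligned = self.warp(prev_bev, ego_motion)
        g = self.gate(torch.cat([cur_bev, prev_aligned], dim=1))
        return g * cur_bev + (1.0 - g) * prev_aligned


if __name__ == "__main__":
    fusion = TemporalBEVFusion(128)
    cur = torch.randn(1, 128, 200, 200)
    prev = torch.randn(1, 128, 200, 200)
    motion = torch.tensor([[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]])  # identity, for illustration
    print(fusion(cur, prev, motion).shape)
```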
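Finally, a minimal sketch of bandwidth-aware multi-vehicle BEV feature sharing as the abstract outlines it: each vehicle channel-compresses its BEV features before transmission, and the ego vehicle decompresses the received features, warps them into its own BEV frame using the relative pose, and fuses them by element-wise max. The compression ratio, pose handling, and max-fusion rule are illustrative assumptions rather than the proposed method's exact design.

```python
# A minimal sketch of multi-vehicle BEV feature sharing under a bandwidth budget.
# Compression ratio, pose handling, and the fusion rule are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class CollaborativeBEVFusion(nn.Module):
    def __init__(self, channels: int = 128, compressed: int = 16):
        super().__init__()
        self.compress = nn.Conv2d(channels, compressed, kernel_size=1)    # sender side
        self.decompress = nn.Conv2d(compressed, channels, kernel_size=1)  # receiver side

    def encode_for_transmission(self, bev: torch.Tensor) -> torch.Tensor:
        # Far fewer channels, hence far fewer bytes on the V2V link.
        return self.compress(bev)

    def fuse(self, ego_bev: torch.Tensor, received: list, rel_poses: list) -> torch.Tensor:
        feats = [ego_bev]
        for msg, pose in zip(received, rel_poses):
            other = self.decompress(msg)
            # pose: (B, 2, 3) affine mapping the sender's BEV grid into the ego grid.
            grid = F.affine_grid(pose, other.shape, align_corners=False)
            feats.append(F.grid_sample(other, grid, align_corners=False))
        # Element-wise max keeps the strongest response per BEV cell, so objects
        # seen only by a cooperating vehicle still survive fusion.
        return torch.stack(feats, dim=0).max(dim=0).values


if __name__ == "__main__":
    model = CollaborativeBEVFusion()
    ego = torch.randn(1, 128, 200, 200)
    msg = model.encode_for_transmission(torch.randn(1, 128, 200, 200))
    pose = torch.tensor([[[1.0, 0.0, 0.1], [0.0, 1.0, 0.0]]])
    print(model.fuse(ego, [msg], [pose]).shape)  # torch.Size([1, 128, 200, 200])
```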