Autonomous driving is an active research topic in artificial intelligence, and many companies and research institutions are developing the technology. End-to-end autonomous driving is one such approach: a neural network takes sensor data as input and outputs vehicle control parameters such as acceleration, braking force, and steering angle. This dissertation studies end-to-end autonomous driving and proposes a driving method based on unsupervised hierarchical reinforcement learning. The main contributions are as follows:
1. A research environment for unsupervised hierarchical reinforcement learning in autonomous driving is built on the CARLA simulation platform to support the subsequent experiments. It includes implementations of various driving scenarios together with data generation and annotation, and the generated data can be used for bird's-eye-view fusion perception. To enable data fusion, coordinate transformations bring the coordinates of the different sensors into a common frame.
2. An unsupervised method for learning driving skills is proposed. Using information-entropy-based unsupervised skill learning, different state spaces are examined on the CARLA simulation platform. Because no hand-designed reward is required, a variety of skills can be trained to cope with complex driving environments. Since the state of an autonomous vehicle changes continuously, a Transformer model is used to learn the corresponding skills, which improves the interpretability of end-to-end autonomous driving to some extent.
3. Within a hierarchical reinforcement learning architecture, an upper-level meta-controller is designed to select among skills. An urban road scene with background vehicles and pedestrians is generated in CARLA as the training environment, and the four elements of reinforcement learning training are defined: state space, action space, environment reward, and termination condition. Training and experiments are carried out in this environment, and preliminary results indicate that the proposed method yields some improvement in training efficiency and interpretability.
Finally, suggestions for possible future improvements are given.
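
The coordinate unification mentioned in the first contribution can be illustrated with a minimal sketch. It is not the dissertation's implementation: the names `SensorMount`, `sensor_to_ego`, and `to_bev_cell`, the mounting poses, and the grid parameters are all hypothetical. The sketch assumes a right-handed frame with counter-clockwise yaw and ignores roll and pitch; the simulator's own axis and handedness conventions may differ and would need to be matched.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SensorMount:
    """Hypothetical mounting pose of a sensor in the ego-vehicle frame."""
    x: float        # metres along the ego x axis
    y: float        # metres along the ego y axis
    z: float        # metres above the vehicle origin
    yaw_deg: float  # rotation about the vertical axis, in degrees


def sensor_to_ego(point, mount):
    """Map a 3D point from a sensor's frame into the ego-vehicle frame.

    Applies a yaw-only rotation followed by a translation (assumption:
    roll and pitch are zero, right-handed axes).
    """
    px, py, pz = point
    yaw = math.radians(mount.yaw_deg)
    c, s = math.cos(yaw), math.sin(yaw)
    ex = c * px - s * py + mount.x
    ey = s * px + c * py + mount.y
    ez = pz + mount.z
    return (ex, ey, ez)


def to_bev_cell(point_ego, cell_size=0.5, grid_half_extent=50.0):
    """Return the (row, col) bird's-eye-view cell, or None if off the grid."""
    x, y, _ = point_ego
    if abs(x) >= grid_half_extent or abs(y) >= grid_half_extent:
        return None
    row = int((x + grid_half_extent) // cell_size)
    col = int((y + grid_half_extent) // cell_size)
    return (row, col)
```

Once every sensor's points pass through `sensor_to_ego` with that sensor's own mount, they share one frame and can be rasterized into the same bird's-eye-view grid with `to_bev_cell`.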