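Autonomous vehicles are an effective technical path to relieving traffic congestion, improving driving safety, and improving fuel economy. Trajectory tracking is the core technology of motion control for autonomous vehicles and the basis for intelligent, safe, and smooth driving. However, trajectory tracking is a typical "multi-objective, nonlinear, multi-constraint" predictive control problem. Most existing studies track the trajectory with model predictive control within a "planning + tracking" motion framework, and they still suffer from low control accuracy and poor real-time performance. The approximate dynamic programming algorithm trains the optimal policy offline and evaluates it online, which moves the heavy optimization computation to the offline training stage and makes it an efficient solver for predictive control problems. This thesis therefore takes the autonomous vehicle as its research object and studies trajectory tracking based on approximate dynamic programming, with the aim of addressing the low control accuracy and poor real-time performance of existing trajectory tracking algorithms. The main research contents and innovations are summarized as follows: First, a high-accuracy nonlinear vehicle dynamics model is built using the magic formula tire model; the model is discretized with the numerically more stable backward Euler method, and lateral trajectory tracking control is formulated as an infinite-horizon constrained predictive optimal control problem; on this basis, a trajectory tracking controller is built with model predictive control, taking tracking error and control input as the objective function. Second, an approximate dynamic programming algorithm with high online computational efficiency is designed; it uses a fully connected neural network as a parameterized function to approximate the control policy, which transforms the online optimization problem into an offline solution for the network parameters. To converge quickly to the optimal policy, a fully connected neural network with two hidden layers is designed, the exponential linear unit is selected as the activation function, and the network parameters are trained by gradient descent. In addition, the convergence and stability of the designed approximate dynamic programming algorithm are proved using Bellman's principle of optimality. Finally, to verify the effectiveness of the designed approximate dynamic programming algorithm, its performance is tested against the model predictive control algorithm in double lane change, sinusoidal trajectory tracking, and step response scenarios. The test results show that the approximate dynamic programming algorithm and the model predictive control algorithm have similar tracking performance, with a mean lateral position error as low as 1 cm; however, its single-step computation time is only 0.5 ms, which is 150 times faster than the model predictive control algorithm with a prediction horizon of 25 steps. The proposed approximate dynamic programming control algorithm therefore offers both good tracking performance and high real-time performance, and it can be used to design highly real-time trajectory tracking controllers for autonomous vehicles.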
Autonomous driving is a crucial technical path to alleviating traffic congestion and improving driving safety and fuel economy. Trajectory tracking is one of the core technologies of autonomous vehicles and the basis for intelligent, safe, and stable driving. However, most existing methods still suffer from limited computational speed and tracking accuracy. Model predictive control (MPC) is widely used in this field, but the cost of solving its optimization problem online exceeds what onboard controllers can deliver, which results in poor real-time performance. To address these challenges, this thesis takes trajectory tracking of the autonomous vehicle as its research object and proposes a fast, computationally efficient solver for predictive control problems based on the approximate dynamic programming (ADP) algorithm. ADP trains the optimal policy offline and then applies it online, which moves the heavy optimization burden to the offline stage and dramatically improves real-time performance. The main research contents and innovations are summarized as follows:

Firstly, as the foundation for the rest of this thesis, a nonlinear vehicle dynamics model for trajectory tracking control is constructed using the magic formula tire model for better model accuracy. The model is then discretized with the numerically stable backward Euler method, and lateral vehicle trajectory tracking control is formulated as a predictive optimal control problem. An MPC-based trajectory tracking controller is then built, with an objective function that combines tracking error and control input. Furthermore, constraints on lateral acceleration, vehicle slip angle, and control inputs are designed to achieve safe and stable trajectory tracking.
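As an illustration of the modeling step described above, the following is a minimal sketch of a magic formula tire force and one backward Euler step of a single-track lateral model. It is not the model used in the thesis: the two-state single-track structure, every parameter value (`B`, `C`, `D`, `E`, `m`, `Iz`, `a`, `b`, `vx`, `dt`), and the fixed-point solver for the implicit equation are assumptions chosen only to keep the example short and self-contained.

```python
import math

def magic_formula(alpha, B=10.0, C=1.9, D=4000.0, E=0.97):
    """Pacejka magic formula: lateral tire force [N] for slip angle alpha [rad].
    The coefficients are illustrative assumptions, not values from the thesis."""
    Ba = B * alpha
    return D * math.sin(C * math.atan(Ba - E * (Ba - math.atan(Ba))))

def lateral_dynamics(state, delta, vx=20.0, m=1500.0, Iz=2500.0, a=1.2, b=1.4):
    """Continuous-time single-track model (assumed structure).
    state = (vy, r): lateral velocity [m/s] and yaw rate [rad/s].
    delta: front steering angle [rad]."""
    vy, r = state
    alpha_f = delta - math.atan2(vy + a * r, vx)  # front slip angle
    alpha_r = -math.atan2(vy - b * r, vx)         # rear slip angle
    Fyf = magic_formula(alpha_f)
    Fyr = magic_formula(alpha_r)
    vy_dot = (Fyf * math.cos(delta) + Fyr) / m - vx * r
    r_dot = (a * Fyf * math.cos(delta) - b * Fyr) / Iz
    return (vy_dot, r_dot)

def backward_euler_step(state, delta, dt=0.01, iters=50, tol=1e-12):
    """One backward Euler step: solve x_next = x + dt * f(x_next, delta).
    The implicit equation is solved by fixed-point iteration, which is only
    adequate while dt times the model's Lipschitz constant is below 1."""
    x = state
    for _ in range(iters):
        f = lateral_dynamics(x, delta)
        x_new = (state[0] + dt * f[0], state[1] + dt * f[1])
        converged = max(abs(x_new[0] - x[0]), abs(x_new[1] - x[1])) < tol
        x = x_new
        if converged:
            break
    return x

if __name__ == "__main__":
    x = (0.0, 0.0)
    for _ in range(100):  # 1 s of simulated time with a constant steering input
        x = backward_euler_step(x, delta=0.05)
    print("vy = %.4f m/s, r = %.4f rad/s" % x)
```

For stiffer models or larger step sizes, a Newton-type solver would normally replace the fixed-point loop; the implicit update itself is what gives backward Euler its numerical stability.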
Secondly, an ADP algorithm with high online computing efficiency is designed. The algorithm uses fully connected neural networks (NNs) as parameterized functions to approximate the optimal control policy. The training environment, the NN structure and hyperparameters, and the update rules for the NN parameters are designed. The online optimization problem is thereby transformed into an offline solution for the NN parameters. The network parameters are updated by gradient descent until they converge to the optimal policy. In addition, the convergence of the algorithm is proved using the Bellman principle of optimality.

Finally, to verify the effectiveness of the designed MPC and ADP algorithms, simulation tests were carried out in several road scenarios, including a sine wave curve, a compound sine wave curve, and a double lane change. The ADP algorithm is compared with the MPC algorithm in terms of tracking performance and computational efficiency. The results show that the ADP algorithm performs better in most simulation scenarios and steps. For example, the mean lateral position error was as low as 1 cm, and the single-step calculation time was, in some situations, 0.5 ms, which is about 150 times faster than MPC with 25 prediction steps. The relative difference from the MPC controller was also as low as 1%, indicating that the ADP algorithm performs similarly to MPC and sometimes outperforms it. The proposed ADP control algorithm therefore combines good tracking performance with high online computational efficiency and can be used effectively as a trajectory tracking controller for autonomous vehicles.
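The low online cost reported above follows from the fact that, once trained, the policy is evaluated with a single forward pass instead of an online optimization. The sketch below illustrates this with a fully connected network that has two hidden layers and exponential linear unit (ELU) activations, as the abstract describes. The four-element state vector, the layer width of 32, the random untrained weights, and the `tanh` output bound `delta_max` are all hypothetical choices made for illustration, not details taken from the thesis.

```python
import math
import random

def elu(x, alpha=1.0):
    """Exponential linear unit activation."""
    return x if x > 0.0 else alpha * (math.exp(x) - 1.0)

def init_layer(n_in, n_out, rng):
    """Random weights and zero biases; a stand-in for trained parameters."""
    scale = 1.0 / math.sqrt(n_in)
    W = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    b = [0.0] * n_out
    return W, b

def dense(W, b, x):
    """Affine layer: W @ x + b, written with plain lists."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def make_policy(n_state=4, n_hidden=32, seed=0):
    """Two hidden layers plus a scalar output layer (sizes are assumptions)."""
    rng = random.Random(seed)
    return [init_layer(n_state, n_hidden, rng),
            init_layer(n_hidden, n_hidden, rng),
            init_layer(n_hidden, 1, rng)]

def policy_steering(params, state, delta_max=0.5):
    """Online step: one forward pass maps the state to a steering angle [rad].
    The tanh scaling that bounds the output is an assumed design choice."""
    h = state
    for W, b in params[:-1]:
        h = [elu(z) for z in dense(W, b, h)]
    W, b = params[-1]
    return delta_max * math.tanh(dense(W, b, h)[0])

if __name__ == "__main__":
    params = make_policy()
    # Hypothetical state: lateral error, heading error, lateral velocity, yaw rate.
    print(policy_steering(params, [0.1, 0.0, -0.05, 0.02]))
```

In an actual ADP setup the parameters would come from offline gradient descent training on the optimal control objective rather than from random initialization; only the forward pass shown here would run on the onboard controller.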