Research on Key Technologies and Applications of Vehicle Routing Optimization Based on Deep Reinforcement Learning

(Original Chinese title: 基于深度强化学习的车辆路径优化关键技术及应用研究)

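Author: Chen Jinwei (陈锦炜)

Student ID: 2020******
Degree: Master's
Email: cjw******.cn
Defense date: 2023.05.17
Advisor: Li Yong (李勇)
Discipline: Electronic Information
Pages: 74
Confidentiality level: Public
Department: 023 Department of Electronic Engineering (电子系)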

Keywords

Vehicle routing problem, customized bus routing problem, courier routing problem, deep reinforcement learning, graph neural network

Abstract

Demand-responsive services in transportation and logistics usually require dispatching vehicles or couriers to carry passengers or goods to specific locations. Well-planned service routes not only reduce operating costs but also give customers higher-quality service. The vehicle routing problem is a class of combinatorial optimization problem, and traditional methods for it suffer from limited solvable problem size, long solution times, and a tendency to get trapped in local optima. To overcome these limitations, deep reinforcement learning has emerged as a new technical approach to vehicle routing optimization. However, because of challenges such as complex dependencies between task nodes and the complex, diverse demands of the parties being served, existing studies have not fully explored the application of deep reinforcement learning to concrete vehicle routing problems in transportation and logistics. To address these challenges, this thesis proposes a series of key techniques for vehicle routing optimization, targeting three typical problems in logistics and transportation: the Vehicle Routing Problem with Mixed Delivery and Pick-up, the customized bus routing problem, and the courier route design problem in the logistics terminal distribution scenario. The innovations and contributions of this thesis are summarized as follows.

First, for the Vehicle Routing Problem with Mixed Delivery and Pick-up, an encoder-decoder network based on a graph neural network and an attention mechanism is proposed to generate solutions. A coordinated decision mechanism for loading rate and routing is also proposed, which avoids the adverse effect of an improperly set vehicle loading rate on the routing policy. Experimental results on real and randomly generated data sets show that the proposed method improves on the baseline algorithms by 14.12% on average and is at least 1.92 times faster than heuristic algorithms in solution speed.

Second, for the customized bus routing problem, an encoder based on a graph attention network is designed to extract information from the road-network distance matrix and the travel time matrix, and an attention-based decoder is used to generate customized bus routes step by step. Experiments on real data sets show that the proposed customized bus route design algorithm improves on the baseline methods by 5.48% on average and outperforms heuristic algorithms in solution speed.

Third, for the logistics terminal distribution scenario, a clustering algorithm that accounts for realistic constraints is proposed to cluster logistics orders. On this basis, the courier routing problem is modeled as a Multi-Depot Vehicle Routing Problem with Simultaneous Pick-up and Delivery and Time Windows, which is then solved with deep reinforcement learning. Experiments on real data sets show that the proposed courier route design algorithm improves on the baseline methods by 10.60% on average and is at least 6.74 times faster than heuristic algorithms in solution speed.
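
The abstract mentions attention-based decoders that build a route one node at a time. The following is a minimal, generic sketch of that idea, not the thesis's network: it scores candidate nodes with scaled dot-product attention, masks out infeasible ones, and greedily picks the most probable node. The function names, the hand-written two-dimensional embeddings (standing in for encoder output), the use of the current node's embedding as the query, and the single capacity constraint are all assumptions made for illustration.

```python
import math


def masked_attention_probs(query, node_embeddings, feasible):
    """Softmax over scaled dot-product scores, restricted to feasible nodes."""
    d = len(query)
    scores = [
        sum(q * k for q, k in zip(query, emb)) / math.sqrt(d) if ok else -math.inf
        for emb, ok in zip(node_embeddings, feasible)
    ]
    top = max(scores)
    if top == -math.inf:
        raise ValueError("no feasible node to attend to")
    # Subtract the maximum for numerical stability; masked nodes get weight 0.
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def greedy_route(node_embeddings, demands, capacity):
    """Build one vehicle route greedily. Node 0 is the depot (hypothetical setup)."""
    n = len(node_embeddings)
    visited = [False] * n
    visited[0] = True
    route, current, remaining = [0], 0, capacity
    while True:
        # A node is feasible if it is unvisited and its demand still fits.
        feasible = [(not visited[i]) and demands[i] <= remaining for i in range(n)]
        if not any(feasible):
            break
        # Toy context: the current node's embedding serves as the query.
        probs = masked_attention_probs(node_embeddings[current], node_embeddings, feasible)
        nxt = max(range(n), key=probs.__getitem__)
        visited[nxt] = True
        remaining -= demands[nxt]
        route.append(nxt)
        current = nxt
    route.append(0)  # return to the depot
    return route


if __name__ == "__main__":
    embeddings = [[0.0, 0.0], [1.0, 0.2], [0.9, 0.4], [-0.5, 1.0]]
    demands = [0, 2, 2, 3]
    print(greedy_route(embeddings, demands, capacity=5))
```

In a learned model the embeddings would come from a trained encoder, and the next node is often sampled from the probabilities during training rather than chosen greedily. The mask is where problem-specific constraints such as pick-up and delivery loads or time windows would be enforced.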
