基于深度强化学习的轨道交通车流与客流一体化优化方法

Integrated Optimization of Train and Passenger Flows for Urban Rail Transit Systems Based on Deep Reinforcement Learning

Author: 宁心怡
  • Student ID
    2021******
  • Degree
    Master's
  • Email
    nin******com
  • Defense date
    2024.05.17
  • Advisor
    吉吟东
  • Discipline
    Control Science and Engineering
  • Pages
    95
  • Confidentiality level
    Public
  • Department
    025 Department of Automation
  • Chinese keywords
    城市轨道交通;深度强化学习;一体化优化;在线优化;多目标强化学习
  • English keywords
    Urban rail transit system; Deep reinforcement learning; Integrated optimization; Online optimization; Multi-objective reinforcement learning

Abstract

In recent years, urban rail transit has played a significant role in alleviating road congestion and facilitating passenger travel, and improving overall energy efficiency and passenger service quality has become a key direction for development. However, passenger flow demand and train scheduling still lack effective online two-way matching, and information barriers exist between urban rail transit subsystems, leaving considerable room for improvement in system-wide integrated optimization. To address these issues, this thesis studies the integrated optimization of train and passenger flows for urban rail transit systems based on deep reinforcement learning. The main contributions are as follows:

(1) Simulation models of the urban rail transit system were designed and implemented to meet the needs of integrated optimization of train and passenger flows. Building on these models, a multilayer perceptron was adopted to accelerate the simulation of train speed curves and train dynamics. While preserving simulation accuracy, this lays the foundation for more efficient computation in the subsequent integrated optimization.

(2) For the online collaborative optimization of network-level train scheduling and passenger flow assignment, a multi-agent reinforcement learning method was proposed. The method models train scheduling and passenger flow assignment separately as Markov decision processes and solves them with a multi-agent reinforcement learning algorithm. It can adjust timetables and passenger flow assignment strategies online as passenger flow changes, meeting the requirements of online optimization. Simulation results based on Chongqing Metro data demonstrate the effectiveness and online computational efficiency of the method, which reduces both the full-day net energy consumption of the traction power supply system and the average passenger waiting time.

(3) For the integrated optimization of train speed curves, train scheduling, and passenger flow assignment, a multi-objective multi-agent reinforcement learning method was proposed. Building on the work in (2), it adds train speed curve optimization and introduces virtual train formation optimization into train scheduling. A hierarchical optimization framework was designed, and a multi-objective multi-agent reinforcement learning method was proposed for the train scheduling problem, yielding a Pareto front solution set. Simulation results based on Chongqing Metro data demonstrate the effectiveness of the method, which achieves more effective Pareto front optimization within a relatively short training time, further reducing the net energy consumption of the urban rail transit system and the average passenger waiting time and improving overall system performance.

(4) The optimization schemes were verified on an urban rail transit system simulation platform. Power supply system simulation was used to analyze the energy flow in the DC-side traction power supply system and the energy efficiency achieved by each scheme, and additional evaluation metrics were introduced for a comprehensive assessment of performance. The simulation results verify the effectiveness of the optimization schemes proposed in this thesis.
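
As a reading aid for the Pareto front mentioned in (3), the following is a minimal sketch of how a non-dominated set can be extracted when two objectives, net energy consumption and average passenger waiting time, are both minimized. It is not the thesis's method: the names `dominates` and `pareto_front` and all numeric values are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# A candidate solution as (net energy, average waiting time); both are minimized.
Point = Tuple[float, float]


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b in both objectives and strictly better in one."""
    return a[0] <= b[0] and a[1] <= b[1] and (a[0] < b[0] or a[1] < b[1])


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the distinct non-dominated points, sorted by the first objective."""
    front = [p for p in points if not any(dominates(q, p) for q in points)]
    return sorted(set(front))


# Hypothetical candidates (illustrative numbers only, not results from the thesis).
candidates = [
    (100.0, 5.0),
    (90.0, 6.0),
    (95.0, 5.5),
    (110.0, 5.0),  # dominated by (100.0, 5.0): same waiting time, more energy
    (92.0, 7.0),   # dominated by (90.0, 6.0): worse in both objectives
    (90.0, 6.0),   # duplicate, collapsed by set()
]
front = pareto_front(candidates)
print(front)
```

Each point left in `front` trades lower energy for longer waiting time or the reverse, which is why a multi-objective method reports a set of solutions and not a single optimum. The pairwise check is quadratic in the number of candidates, which is acceptable for a small illustrative set.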