登录 EN

添加临时用户

面向即时配送订单指派的智能优化与预测决策

Intelligent Optimization and Predictive Decision-Making on Order Dispatching for On-Demand Delivery

作者:陈靖方
  • 学号
    2017******
  • 学位
    博士
  • 电子邮箱
    cjf******.cn
  • 答辩日期
    2022.11.22
  • 导师
    王凌
  • 学科名
    控制科学与工程
  • 页码
    121
  • 保密级别
    公开
  • 培养单位
    025 自动化系
  • 中文关键词
    即时配送,订单指派,智能优化,机器学习,预测决策
  • 英文关键词
    on-demand delivery, order dispatching, intelligent optimization, machine learning, predictive decision-making

摘要

即时场景下的配送问题具有大规模、强动态、高时效等特点,要求在极短计算时间内完成优化与决策,而传统的精确算法、进化算法等优化技术难以满足在线决策需求。本学位论文面向美团即时配送平台的订单指派问题,针对多骑手、小单量、大单量和全时段等代表性实际场景,融合机器学习和启发式算法,开展智能优化与预测决策方法研究,旨在通过削减订单骑手匹配空间和优化订单骑手匹配关系提升配送效率和顾客体验。论文综述了即时场景下调度优化的代表性研究进展,面向多种代表性场景通过深入研究在建模与优化决策方法方面取得了如下主要成果:(1)面向骑手数量众多的即时配送场景,通过基于图神经网络的骑手信息建模,设计了学习新单、骑手以及全局信息之间关系的注意力机制,提出了一种候选骑手筛选算法,有助于削减订单骑手匹配空间;进而,提出了一种基于后悔值的订单指派算法,通过减少不合理的匹配关系提升配送性能。(2)面向闲时段下的小单量即时配送场景,设计了多种基于不同优化方向的调度规则,通过构造问题驱动的有效特征和模型结构,凝练基于机器学习的自适应选择调度规则的机制,进而提出了一种多调度规则引导的订单指派算法,通过优化小单量场景下的订单骑手匹配关系提升配送性能。(3)面向高峰期间的大单量即时配送场景,提出了一种“离线学习+在线决策”的求解框架,设计了高效的启发式指派规则和基于问题结构的多邻域搜索算子,通过凝练基于模仿学习的模型辅助优化决策机制,提出了一种专家知识引导的订单指派算法,通过优化大单量场景下的订单骑手匹配关系提升配送性能。(4)面向全时段下的即时配送场景,设计了关键订单骑手关系的解耦策略,设计了一种基于关键订单的强化学习方法,进而提出了一种融合解耦策略和强化学习的订单指派算法,通过优化全时段场景下的匹配关系提升配送性能。论文通过基于大量实际数据的离线实验和在线AB测试以及统计分析,验证了本论文所提算法相比于文献中的代表性算法和美团实际应用算法的有效性。同时,部分所提算法已在全国成功推广应用。

The on-demand delivery problem has the characteristics of large-scale complexity, strong dynamism, and high timeliness, which limits the optimization and decision-making to a very short computation time. It is difficult for traditional methods such as exact algorithms and evolutionary algorithms to meet such online decision-making requirements. For the order dispatching problem in Meituan on-demand delivery platform, this dissertation focuses on typical realistic scenarios with a large number of riders, a small number of orders, a large number of orders, and a different number of orders, respectively, conducts the methodological research on intelligent optimization and predictive decision-making by combining the machine learning methods and heuristics, and aims at improving the delivery efficiency and customer satisfaction via reducing the matching space and optimizing the matching relationship of orders and riders.This dissertation has reviewed the representative state-of-the-art of on-demand delivery, and achieved the following contributions through deep research on the modeling and methods of optimization and decision-making for multiple typical scenarios:(1) For the on-demand delivery scenario with a large number of riders, an algorithm for selecting candidate riders is proposed to reduce the matching space of orders and riders. Specifically, by modeling rider information based on graph neural networks, two attention mechanisms to learn the relationships among orders, riders, and global information are designed. Moreover, an order dispatching algorithm based on regret value is presented by avoiding unreasonable matching relationships to improve delivery performance.(2) For the on-demand delivery scenario with a small number of orders during the off-peak period, a variety of tie-breaking operators are designed based on different optimization directions. Then, an adaptive mechanism of selecting tie-breaking operators based on machine learning is presented by constructing problem-driven useful features and model structures. Thus, an adaptive tie-breaking order dispatching algorithm is proposed, which improves the delivery performance via optimizing the matching relationships of riders and a small number of orders.(3) For the on-demand delivery scenario with a large number of riders during the on-peak period, an “offline learning for online decision-making” resolution framework is proposed. Specifically, an effective heuristic and multiple search operators on problem-specific neighborhoods are designed. Thus, an expertise-guided order dispatching algorithm is proposed by developing a model-assisted optimization and decision-making mechanism based on imitation learning, which improves the delivery performance via optimizing the matching relationships of riders and a large number of orders.(4) For the on-demand delivery scenario with a different number of orders, a decoupling strategy to determine critical orders and riders is designed, and a critical order-based reinforcement learning method is proposed. Thus, an order dispatching algorithm based on the decoupling strategy and reinforcement learning is proposed, which improves the delivery performance via optimizing the matching relationships of riders and orders.By conducting offline experiments and online AB tests on extensive real-world datasets and statistical analysis, the effectiveness of the proposed algorithms is validated compared with the typical algorithms from the literature and the practical algorithms used by Meituan. Meanwhile, some proposed methods have been successfully applied nationwide.