Multi-scale Modeling of Traffic Travel Behavior Based on Taxi Trajectories

Original title: 基于出租车轨迹的多尺度交通出行行为研究

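Author: 丁诗婷 (Ding Shiting)
  • Student ID
    2020******
  • Degree
    Master's
  • Email
    dst******com
  • Defense date
    2023.05.14
  • Supervisor
    李志恒 (Li Zhiheng)
  • Discipline
    Electronic Information
  • Pages
    66
  • Confidentiality level
    Public
  • Training unit
    599 International Graduate School
  • Keywords (Chinese)
    出租车轨迹,轨迹挖掘,顺序模式,多尺度,轨迹聚类
  • Keywords (English)
    taxi trajectories, trajectory mining, sequential pattern, multi-scale, trajectory clustering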

Abstract

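Location-aware devices track and record moving objects, producing large volumes of trajectory data that are one of the important data sources for intelligent transportation systems. Modeling travel behavior from trajectory data has become a popular topic in the field in recent years, with wide application in trajectory prediction, region-of-interest discovery, and traffic congestion analysis. Travel behavior modeling is often based on the macro-level trajectory distribution obtained by clustering. However, the characteristics of trajectory data limit the effectiveness of clustering results. Moreover, clustering operates on whole trajectories and easily overlooks local information, such as segments shared by multiple trajectories. Across many trajectories, a highly repeated path indicates that most travelers chose that path at a particular time. Such a repeated path is also called a sequential pattern: a certain number of moving objects moving among locations in a similar order within similar time periods. The process of finding these patterns is called sequential pattern mining. Sequential pattern mining can effectively extract travel behavior patterns from trajectory data; it is a meso-level analysis that can uncover correlations between different hotspot areas. This thesis therefore proposes a framework for multi-scale spatio-temporal modeling of traffic travel behavior patterns based on vehicle trajectory data. The framework combines the macro-level spatio-temporal distribution with the meso-level trajectory distribution to mine travel behavior features at different scales.

To address the ambiguity, uncertainty, and redundant pattern output encountered when mining sequential patterns from vehicle trajectories, this thesis proposes a data processing framework for vehicle trajectory mining. The framework is based on map partitioning and mapping, and it discretizes the data. The processing also reduces redundant patterns in the mining results. For meso-level pattern mining, eight representative algorithms covering different search strategies, database representations, and constraint types are selected and compared experimentally in terms of the number of output patterns, memory consumption, and running time. The thesis then analyzes and verifies the effectiveness of the continuous constraint for vehicle trajectories. For macro-level trajectory mining, the spatio-temporal clustering method ST-DBSCAN is adopted. Because the very large number of data points prevents the clustering algorithm from clustering effectively, the results of sequential pattern mining are added to the clustering as local trajectory information. Clustering uses the silhouette coefficient as its metric, and the k-nearest-neighbor method and a hyperparameter search are used to find the most suitable parameters.

The proposed framework goes beyond the traditional single-scale perspective and yields a more comprehensive and fine-grained behavior model. Adding local frequent-trajectory information to the macro-level clustering produces effective macro-level travel behavior results. For the meso-level sequential pattern mining algorithms, experiments verify that, for trajectory data, the continuous constraint concisely represents the patterns of the complete result set with fewer patterns, and that it also outperforms unconstrained algorithms in running time and memory consumption. Finally, the results are visualized on a map of Beijing for analysis and validation of the method.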
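The map partitioning and discretization step described above can be illustrated with a minimal sketch. It assumes a uniform square grid, which the abstract does not specify; the function names, the grid origin, and the cell size are hypothetical choices made for illustration, not the thesis's implementation.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y), e.g. projected longitude and latitude
Cell = Tuple[int, int]       # (column, row) index of a grid cell


def to_cell(point: Point, origin: Point, cell_size: float) -> Cell:
    """Map a point to the index of the square grid cell that contains it."""
    x, y = point
    x0, y0 = origin
    return (int((x - x0) // cell_size), int((y - y0) // cell_size))


def discretize(trajectory: List[Point], origin: Point, cell_size: float) -> List[Cell]:
    """Convert a trajectory into a sequence of grid cells.

    Consecutive points that fall in the same cell are merged, so the output
    is a sequence of distinct successive cells (one simple way to cut down
    on redundant items before pattern mining).
    """
    cells: List[Cell] = []
    for point in trajectory:
        cell = to_cell(point, origin, cell_size)
        if not cells or cells[-1] != cell:
            cells.append(cell)
    return cells


def origin_destination(cells: List[Cell]) -> Tuple[Cell, Cell]:
    """Return the (origin, destination) cell pair of a discretized trajectory."""
    if not cells:
        raise ValueError("empty trajectory")
    return cells[0], cells[-1]
```

Each discretized trajectory is then a sequence of discrete items, which is the input form that sequential pattern mining algorithms expect.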

Location-aware devices track moving objects and generate vast amounts of trajectory data, an important data source for Intelligent Transportation Systems (ITS). Pattern discovery from trajectory data is useful for trajectory prediction, region-of-interest discovery, and traffic congestion research, which has made it a popular research topic in recent years. The macro-level travel distribution used in travel behavior modeling is commonly obtained with clustering methods. However, clustering methods often fail to find clusters effectively because of the nature of trajectory data. Because clustering is usually performed on entire trajectories, local features such as similar segments shared across trajectories are not well detected. A highly repetitive path among trajectories means that the path is chosen by most travelers at a particular moment. One way to extract highly repetitive routes, or frequent trajectories, is Sequential Pattern Mining (SPM). A sequential pattern describes a certain number of moving objects moving among locations in a similar order within a similar time period. This meso-level analysis can uncover, among other things, correlations between different hotspot areas. This thesis therefore proposes a multi-scale spatio-temporal framework for modeling traffic travel behavior patterns based on massive taxi trajectory data. The framework aims to discover travel behavior features at different scales by combining the global spatio-temporal distribution with the dynamic trajectory distribution at the mesoscopic level. Trajectories first undergo segmentation, simplification, and origin-destination (OD) extraction to resolve the difficulties present in the trajectory data. Subsequently, a sequential pattern mining algorithm that uncovers frequent trajectory patterns at the meso level based on association rules is combined with spatio-temporal feature clustering at the macro level to model multi-scale travel behavior.

To address the ambiguity, uncertainty, and redundant pattern output encountered when mining sequential patterns from vehicle trajectories, this thesis proposes a data processing framework. The framework simplifies the trajectory data using map projection and a grid-based method for trajectory segmentation, while discretizing the trajectories and extracting their OD information. The processed data are suitable for sequential pattern mining algorithms and yield fewer redundant patterns in the results. For meso-level trajectory mining, eight typical pattern mining algorithms are selected for comparison experiments. The selected algorithms cover different search methods, database forms, and constraint forms, and they are compared in terms of the number of output patterns, memory consumption, and running time. Finally, the results of the continuous-constraint-based CM-SPAM and the unconstrained CM-SPADE pattern mining algorithms are compared. For macro-level trajectory mining, the framework uses the spatio-temporal clustering method ST-DBSCAN and incorporates the information obtained from sequential pattern mining. The clustering process first uses k-nearest neighbors to search for an appropriate $\epsilon$. A hyperparameter search is then performed with the silhouette coefficient as the metric. Finally, the clustering results are visualized using the best parameters found.

The proposed framework extends the typical single-scale perspective, yielding a more comprehensive and fine-grained model of traffic travel behavior. Macro-level clustering obtains effective results because it includes local frequent-trajectory information. At the meso level, experiments verify that continuous constraints are the most suitable for trajectory data during sequential pattern discovery, because the result set represents the entire pattern set more concisely, with fewer patterns. Continuous-constrained algorithms also outperform unconstrained algorithms in time and memory consumption. To validate the proposed framework, the results are visualized on a map of Beijing, where the dataset was collected.
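To make the effect of the continuous constraint concrete, here is a minimal brute-force sketch. It is not CM-SPAM, CM-SPADE, or any of the eight algorithms compared in the thesis; it only counts contiguous subsequences directly. The function name, the `min_support` and `max_length` parameters, and the toy sequences are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Hashable, Optional, Sequence, Tuple

Pattern = Tuple[Hashable, ...]


def frequent_contiguous_patterns(
    sequences: Sequence[Sequence[Hashable]],
    min_support: int,
    max_length: Optional[int] = None,
) -> Dict[Pattern, int]:
    """Return contiguous subsequences of length >= 2 with their support.

    Support is the number of input sequences that contain the pattern as an
    unbroken run of items. Patterns below min_support are dropped.
    """
    support: Dict[Pattern, int] = defaultdict(int)
    for seq in sequences:
        seen = set()  # count each pattern at most once per sequence
        n = len(seq)
        for start in range(n):
            longest = n - start
            if max_length is not None:
                longest = min(longest, max_length)
            for length in range(2, longest + 1):
                seen.add(tuple(seq[start:start + length]))
        for pattern in seen:
            support[pattern] += 1
    return {p: s for p, s in support.items() if s >= min_support}


# Toy example: each letter stands for a discretized location (e.g. a grid cell).
trips = [
    ["A", "B", "C", "D"],
    ["A", "B", "C"],
    ["B", "C", "D"],
    ["A", "C", "D"],
]
patterns = frequent_contiguous_patterns(trips, min_support=2)
```

In this toy data, `("A", "C")` would be frequent without the constraint, since three trips visit A and later C, but under the constraint it is supported only by the one trip in which C immediately follows A and is therefore dropped. This is the sense in which a contiguity requirement yields a smaller result set.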