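Mobile communication networks are a critical infrastructure for the operation and development of modern society, and the massive volume of traffic data they carry holds rich information and significant value. Supporting major applications such as digital twin networks requires accurately simulating the dynamics of network traffic at different levels, from individual users to base station aggregates. At present, acquiring real traffic data from operational mobile networks is constrained in both scale and quality, so generating large-scale, high-fidelity, fine-time-granularity traffic data from a small amount of operational network data has become an important research topic. This research faces three serious challenges: traffic behavior is dynamic and variable, the traffic produced by users is dynamic and diverse, and the influence of the environment on aggregated traffic is hard to model. To address these challenges, this thesis studies four key problems: mobile network traffic analysis and device identification, traffic generation for smartphone users and for Internet of Things (IoT) users at the individual level, and base station traffic generation at the aggregate level. The main innovations and contributions are as follows.

First, for traffic analysis and device identification, this thesis systematically analyzes traffic behavior characteristics in three aspects: basic network properties, spatial mobility, and temporal activity. The extracted features distinguish smartphones from IoT devices with an accuracy above 95%. The thesis then mines traffic behavior characteristics at three levels (packet, flow, and TCP connection) and proposes an information-gain-based method for selecting representative features, which identifies different IoT device types with an accuracy above 90%.

Second, for smartphone user traffic generation, this thesis introduces users' spatial mobility and application usage information and, targeting the multi-time-scale variation of traffic, models the variable traffic patterns at short time scales, the differing pattern-switching behaviors at long time scales, and the category associations among traffic. It further designs a multi-scale, multi-level generative model that generates large-scale, fine-time-granularity traffic from a small amount of operational network data. Its performance in traffic distribution fidelity and periodicity, as well as in data applications, is significantly better than that of existing methods.

Third, for IoT user traffic generation, this thesis applies semantic information mining to mobile network data to build a network knowledge graph, which models the multiple factors that influence IoT user traffic. It further proposes a network-knowledge-enhanced traffic generation model that jointly models network knowledge, terminal device types, and traffic variation characteristics, so that the device types and the traffic of IoT users are generated together. The generated data outperform existing methods in metrics such as the fidelity of the device type distribution and of the traffic distribution, and in data applications.

Fourth, for base station aggregated traffic generation, this thesis models the environment of each base station's coverage area with an urban knowledge graph and proposes an urban-knowledge-enhanced mobile network traffic generation model that models, in turn, the periodic patterns of traffic at different scales and its aperiodic fluctuations. The model outperforms existing methods in traffic distribution fidelity and periodicity. The thesis further proposes a cross-city joint knowledge modeling method which, through a constructed base station graph and an urban knowledge transfer model based on graph convolutional networks, transfers base station aggregated traffic generation across cities.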
Mobile networks are an important infrastructure for the development of modern society. The massive traffic data they carry contain rich information and significant value. Supporting digital twin networks and other major applications requires accurate simulation of dynamic network traffic variations. However, large-scale, high-fidelity traffic data are rarely available, so generating large-scale, high-fidelity, fine-time-granularity traffic data has become an important research topic. This task faces three serious challenges: first, mobile network traffic behaviors are highly dynamic; second, mobile users generate diverse and dynamic network traffic; finally, the influence of the surrounding urban environment on the aggregated traffic of base stations is difficult to model. To address these challenges, this thesis focuses on four key issues: mobile network traffic analysis and device identification, traffic generation for smartphone users and for Internet of Things (IoT) users at the individual level, and base station traffic generation at the aggregation level. The major innovations and contributions of this thesis are as follows.

First, in terms of traffic analysis and device identification, this thesis systematically analyzes the behaviors of mobile terminal devices in terms of basic network characteristics, spatial movement, and temporal activities, and extracts features that distinguish smartphones from IoT devices with an accuracy of over 95%. The thesis then examines packet-level, flow-level, and connection-level characteristics of traffic data and selects representative features based on information gain for IoT device type identification, reaching an accuracy of more than 90%.

Second, in terms of smartphone user traffic generation, this thesis computes the statistical characteristics of the movements and application usage of each user and proposes a multi-scale hierarchical generative adversarial network (MSH-GAN). MSH-GAN models the traffic patterns of smartphone users at short time scales, the pattern-switching modes at long time scales, and the cluster structure among multiple smartphone users. The proposed method generates large-scale, high-fidelity smartphone user traffic from small-scale real data and outperforms other methods in the fidelity of the traffic distribution, in traffic periodicity, and in data applications.

Third, in terms of IoT user traffic generation, this thesis models the factors that influence IoT user behaviors with a network knowledge graph, which is constructed by extracting semantic information from network traffic and analyzing the attributes and behaviors of individuals, devices, and platforms. Based on this network knowledge, the thesis proposes a knowledge-enhanced generative adversarial network (GAN) that jointly models the network knowledge, device types, and traffic variations of IoT users. The proposed method generates large-scale, high-fidelity IoT user traffic from small-scale real data and outperforms other methods in the fidelity of the traffic distribution and in data applications.

Finally, in terms of aggregated traffic generation for base stations, this thesis uses an urban knowledge graph to learn the impact of the urban environment on base stations and designs an urban knowledge-enhanced GAN that models the periodic patterns and aperiodic fluctuations of base station traffic at different time scales. The proposed method outperforms other methods in the fidelity of the traffic distribution and in traffic periodicity. Furthermore, this thesis proposes a joint modeling method for cross-city knowledge, which transfers aggregated base station traffic generation across cities by constructing a graph over base stations and designing an urban knowledge transfer model based on graph convolutional networks.
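
As an illustration of the information-gain-based feature selection mentioned in the first contribution, the following is a minimal sketch, not the thesis's actual implementation. It assumes features have already been discretized; the feature names (`packet_size_bin`, `uses_tcp`), the device labels, and the toy data are hypothetical and chosen only to show the ranking step.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus their entropy conditioned on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(features, labels):
    """Return (feature name, information gain) pairs, highest gain first."""
    gains = {name: information_gain(vals, labels) for name, vals in features.items()}
    return sorted(gains.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy data: four devices, two discretized traffic features.
labels = ["meter", "meter", "tracker", "tracker"]
features = {
    "packet_size_bin": ["small", "small", "large", "large"],  # separates the classes
    "uses_tcp": ["yes", "no", "yes", "no"],                   # carries no class information
}

ranking = rank_features(features, labels)
for name, gain in ranking:
    print(f"{name}: {gain:.3f} bits")
```

In this toy case the packet-size feature separates the two device types perfectly and ranks first, while the TCP flag is uninformative. A selection step along these lines would keep only the top-ranked features before training a device type classifier.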