全球IPv6活跃地址探测算法研究和原型系统实现

Algorithm Research and Prototype System Implementation of Global IPv6 Active Address Probing

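Author: 李果
  • Student ID
    2018******
  • Degree
    Master's
  • Email
    199******com
  • Defense date
    2021.05.20
  • Advisor
    杨家海
  • Discipline
    Cyberspace Security
  • Pages
    82
  • Confidentiality level
    Public
  • Department
    412 网络研究院 (Network Research Institute)
  • Keywords (Chinese)
    IPv6,主动探测,强化学习,图社区发现,可视化
  • Keywords (English)
    IPv6, active scanning, reinforcement learning, graph community detection, visualization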

Abstract

With IPv4 address resources exhausted, IPv6 deployment is accelerating worldwide. Probing for active IPv6 addresses at global scale is essential for security vulnerability analysis and network asset monitoring. However, traditional brute-force scanning cannot cover the vast IPv6 address space. The alternative is to collect active IPv6 addresses, called seed addresses, from various data sources and to infer new, possibly active IPv6 addresses by learning the structural patterns of the seeds. Although this approach has become the mainstream IPv6 probing technique, its low hit rate and heavy reliance on seed addresses make it hard to apply to global IPv6 active address probing.

To address these problems, this thesis studies how to improve the hit rate of seed-based probing algorithms, and how to probe for active addresses under Border Gateway Protocol (BGP) prefixes that lack seed addresses. Drawing on reinforcement learning, data mining, and visualization techniques, it designs and implements a prototype system for global IPv6 active address probing. The research contents and main contributions are as follows:

1. It proposes 6TS, an active address probing algorithm based on reinforcement learning. The algorithm first classifies the seed addresses according to their multi-dimensional information and models the address space using four representation strategies. It then introduces a Thompson sampling model to adjust the probing targets dynamically during the scanning phase, so that regions containing more active addresses are probed first. Experimental results show that, with tens of millions of seed addresses as input, 6TS improves the hit rate by 10.5% over the state-of-the-art seed-based probing algorithm.

2. It proposes GAG, an active address probing method based on graph community detection. The method uses a graph data structure to describe the degree of similarity among address structure patterns in different networks, and applies graph community detection algorithms to mine general address structure patterns. These general patterns are then transferred to arbitrary BGP prefixes for active address probing through an organization association strategy and a similarity matching strategy. Experimental results show that, when probing more than one hundred thousand BGP prefixes, GAG discovers 150 million new active IPv6 addresses, 113 times as many as the state-of-the-art probing algorithm, and covers 81.6% of the BGP prefixes.

3. It designs and implements a prototype system for global IPv6 active address probing. The prototype provides an efficient search service for IPv6 address data and integrates several visualization techniques to show address activity and statistics intuitively to network administrators. The system can carry out long-term active probing automatically and accumulate active IPv6 addresses worldwide, thereby providing data support for network topology research, security research, and client address research.
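The dynamic target adjustment described for 6TS can be illustrated with a minimal Thompson sampling sketch. This is not the thesis's implementation: the `Region` class, the Beta-Bernoulli hit-rate model, the simulated probe, and the example prefixes (taken from the 2001:db8::/32 documentation range) are all assumptions made for illustration. The abstract does not specify how 6TS represents regions or updates its model.

```python
import random
from dataclasses import dataclass


@dataclass
class Region:
    """Hypothetical candidate address region with probe statistics."""
    name: str
    hits: int = 0    # probes that found an active address
    misses: int = 0  # probes that found no active address

    def sample(self, rng):
        # Draw a plausible hit rate from the Beta(1 + hits, 1 + misses) posterior.
        return rng.betavariate(1 + self.hits, 1 + self.misses)


def thompson_scan(regions, probe, budget, rng):
    """Spend `budget` probes, each on the region with the highest sampled hit rate."""
    for _ in range(budget):
        region = max(regions, key=lambda r: r.sample(rng))
        if probe(region, rng):
            region.hits += 1
        else:
            region.misses += 1
    return regions


def simulated_probe(true_rates):
    """Build a fake probe that succeeds with a fixed, assumed per-region probability."""
    def probe(region, rng):
        return rng.random() < true_rates[region.name]
    return probe


if __name__ == "__main__":
    rng = random.Random(0)
    # Assumed ground-truth hit rates, used only to drive the simulation.
    true_rates = {
        "2001:db8:a::/48": 0.30,
        "2001:db8:b::/48": 0.05,
        "2001:db8:c::/48": 0.01,
    }
    regions = [Region(name) for name in true_rates]
    thompson_scan(regions, simulated_probe(true_rates), budget=2000, rng=rng)
    for r in regions:
        print(f"{r.name}: probes={r.hits + r.misses}, hits={r.hits}")
```

In this sketch, regions that keep returning hits are sampled with higher rates and therefore receive more of the probing budget, while poorly performing regions are still revisited occasionally. A real scanner would replace `simulated_probe` with actual network probes.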