Optimizing the performance of reliable transport protocols has long been a focus of both academia and industry. A reliable transport protocol comprises three key components: a congestion control algorithm, a flow control mechanism, and a loss recovery mechanism. Most studies focus on designing new congestion control algorithms, yet flow control and loss recovery can also become performance bottlenecks. This thesis reveals that abnormal flow control leaves network bandwidth underutilized, and that loss recovery relying on retransmission timeouts results in high latency. To address these two problems, this thesis proposes optimization strategies and enhancement mechanisms for reliable transport protocols. The main contributions are as follows:

(1) We reveal the TCP capping phenomenon and propose a scheme to curb its impact. This thesis identifies an abnormal flow control phenomenon termed TCP capping, in which the receive window does not fully reflect the receiver's processing capability yet still throttles the sender. To help Internet content providers curb the performance impact of TCP capping, this thesis proposes the Apollo algorithm. Running on the server side, Apollo probes the receiver's processing capability by observing changes in the receive window, and assists TCP in deciding how many packets to send. Evaluations on a small-scale testbed and in a production CDN both show significant performance gains with Apollo.

(2) We design a receive buffer allocation algorithm that reflects the receiver's true processing capability. To fundamentally solve the problems of the existing flow control mechanism, such as TCP capping, this thesis proposes a new TCP receive buffer allocation algorithm, RBA. RBA estimates the bandwidth-delay product of the network and the processing-capability-delay product of the receiver, and promptly adjusts the receive buffer of each connection to match both the network bandwidth and the receiver's processing capability. Experiments show that RBA adapts to different scenarios, achieves high throughput, consumes less memory, and maintains the fairness of the protocol.

(3) We propose an explicit dropping notification mechanism to enhance loss recovery. To address the high latency caused by retransmission timeouts in data center networks, this thesis proposes Explicit Dropping Notification (EDN). The basic idea of EDN is to send the packet loss notification directly from the switch to the source, enabling precise and timely retransmission. EDN also implicitly conveys capacity information, which helps congestion control converge quickly to an appropriate rate. We build an EDN prototype on a P4 programmable switch. A small-scale testbed and large-scale simulations both show that EDN significantly reduces average latency and tail latency.

(4) We propose an enhancement mechanism for loss recovery in RDMA NICs. To remain compatible with the RDMA NICs that are widely deployed in low-latency data centers, this thesis takes the loss recovery characteristics of RDMA NICs into account and, following the concept of explicit dropping notification, designs an enhancement mechanism called Lightning for efficient loss recovery. Lightning allows the switch to send the retransmission signal directly to the corresponding source. It also filters out unnecessary out-of-order packets to reduce the redundant retransmissions caused by go-back-N in conventional RDMA NICs, thereby avoiding wasted bottleneck bandwidth. Experiments show that Lightning significantly reduces the latency of RDMA network transmission.
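
To make the capping effect in contribution (1) and the bandwidth-delay product used in contribution (2) concrete, the following minimal sketch computes the bandwidth-delay product of a path and the throughput ceiling imposed by a given receive window. It relies only on the standard relation that a window-limited sender can deliver at most one window per round trip. It is not the Apollo or RBA algorithm; the function names and the example numbers (100 Mbit/s, 50 ms, 64 KiB) are illustrative assumptions and do not come from the thesis.

```python
def bandwidth_delay_product(bandwidth_bps: float, rtt_s: float) -> float:
    """Return the bytes that must be in flight to keep the path full."""
    return bandwidth_bps * rtt_s / 8.0


def window_limited_throughput(rwnd_bytes: float, bandwidth_bps: float, rtt_s: float) -> float:
    """Return an upper bound on throughput in bit/s.

    A sender can have at most one receive window outstanding per round
    trip, so throughput cannot exceed rwnd / RTT, nor the link bandwidth.
    Congestion control and packet loss are ignored in this sketch.
    """
    return min(bandwidth_bps, rwnd_bytes * 8.0 / rtt_s)


if __name__ == "__main__":
    # Illustrative values only (assumptions, not measurements from the thesis).
    bandwidth = 100e6   # 100 Mbit/s link
    rtt = 0.05          # 50 ms round-trip time
    rwnd = 64 * 1024    # 64 KiB advertised receive window

    bdp = bandwidth_delay_product(bandwidth, rtt)
    cap = window_limited_throughput(rwnd, bandwidth, rtt)
    print(f"BDP: {bdp:.0f} bytes")
    print(f"Throughput ceiling: {cap / 1e6:.2f} Mbit/s")
```

With these example values the advertised window is roughly a tenth of the bandwidth-delay product, so the connection can use only about a tenth of the link regardless of the congestion control algorithm in use. This is the kind of underutilization that a receive window smaller than the path requires would produce.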