Cloud computing has been widely adopted in recent years because it is easy to maintain and stores data across geographically distributed sites. The spread of cloud storage services opens up new ways to deliver reliable, low-latency services: application providers can build a content distribution network on top of cloud storage services (a cloud CDN) to serve users with low latency. Compared with traditional professional CDN services, cloud CDNs are inexpensive and impose no requirements on customer volume or iteration cycles. Compared with self-built CDNs, they are faster to deploy and easier to operate and maintain. However, data retrieval latency in storage clouds is highly variable, which may degrade the user experience of applications.

First, through benchmark measurements on cloud CDN prototype systems built on Amazon AWS and Microsoft Azure, together with an analysis of a large-scale latency dataset collected from a major commercial cloud CDN provider, we identify a high tail latency problem in cloud CDNs: the tail latency can be up to 1076 times the median. The causes of this variance are diverse and complex, and the variance is almost impossible to avoid. We therefore look for a solution that works within a high-variance environment, and find that downloading data from multiple cloud data centers effectively reduces the user-perceived tail latency in cloud CDNs. However, cloud storage providers charge by bandwidth usage, while application providers have limited budgets. How to design an optimal user request scheduling scheme under cost constraints therefore becomes a key research problem.

Second, to optimize tail latency under cost constraints, we propose TailCutter, a workload scheduling framework that sends parallel download requests for different data chunks of a file to different cloud data centers. We first formulate the Tail Latency Minimization (TLM) problem, and then design an offline algorithm, the Maximum Tail Minimization Algorithm (MTMA), that solves the TLM problem in polynomial time. However, because MTMA requires a global view of bandwidth and workload information, its applicability is limited. To address this limitation, we design a more practical online algorithm, the Receding Horizon Control Based Maximum Tail Minimization Algorithm (RHC-based MTMA), which schedules requests based on bandwidth predictions over a short future horizon and adjusts the schedule as the horizon rolls forward, so that tail latency is optimized efficiently and robustly under cost constraints.

Finally, we implement a TailCutter prototype on cloud data centers of Amazon AWS and Microsoft Azure, evaluate it with a 2 TB real-world dataset collected from a major ISP, and conduct large-scale simulations. The evaluation results show that TailCutter reduces user-perceived tail latency by up to 73.3%.
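
To make the cost-constrained scheduling idea more concrete, the following Python sketch splits one file across several data centers so that the slowest chunk finishes as early as possible without exceeding a budget. It is not MTMA or RHC-based MTMA. It is a toy model under stated assumptions: each data center has a fixed, known bandwidth (MB/s) and a per-MB price, the file can be divided arbitrarily, and all chunks download in parallel. The function names `cheapest_split` and `min_max_latency` are hypothetical and chosen only for this illustration.

```python
from typing import List, Optional, Tuple


def cheapest_split(bandwidth: List[float], price: List[float],
                   size: float, deadline: float
                   ) -> Optional[Tuple[List[float], float]]:
    """Cheapest way to fetch `size` MB within `deadline` seconds.

    Returns (MB assigned to each data center, total cost), or None if
    the data centers together cannot deliver `size` MB in time.
    """
    alloc = [0.0] * len(bandwidth)
    remaining, cost = size, 0.0
    # Fill the cheapest data centers first; within the deadline each one
    # can deliver at most bandwidth * deadline MB.
    for i in sorted(range(len(bandwidth)), key=lambda k: price[k]):
        take = min(remaining, bandwidth[i] * deadline)
        alloc[i] = take
        cost += take * price[i]
        remaining -= take
    if remaining > 1e-9:
        return None
    return alloc, cost


def min_max_latency(bandwidth: List[float], price: List[float],
                    size: float, budget: float, iters: int = 60
                    ) -> Optional[Tuple[float, List[float]]]:
    """Binary-search the smallest completion time that fits the budget."""
    cheapest = min(range(len(price)), key=lambda k: price[k])
    if size * price[cheapest] > budget:
        return None  # even the cheapest data center alone is too expensive
    lo = size / sum(bandwidth)        # every data center at full rate
    hi = size / bandwidth[cheapest]   # cheapest data center alone (feasible)
    for _ in range(iters):
        mid = (lo + hi) / 2
        plan = cheapest_split(bandwidth, price, size, mid)
        if plan is not None and plan[1] <= budget:
            hi = mid                  # feasible: try a tighter deadline
        else:
            lo = mid                  # infeasible: relax the deadline
    alloc, _ = cheapest_split(bandwidth, price, size, hi)
    return hi, alloc


if __name__ == "__main__":
    # Two hypothetical data centers: fast and cheap vs. slow and expensive.
    print(min_max_latency([10.0, 5.0], [1.0, 3.0], size=30.0, budget=40.0))
```

In this toy setting a larger budget lets more traffic go to the expensive data center and shortens the completion time, while a tight budget pushes everything onto the cheapest one. An online scheme in the spirit of receding horizon control would rerun such a computation with fresh bandwidth predictions each time the horizon moves forward.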