Cloud data centers are rapidly evolving. At the same time, large-scale data analytics applications require non-trivial performance tuning that is often specific to the applications, workloads, and data center infrastructure. We propose TeShu, which makes network shuffling an extensible unified service layer common to all data analytics. Since an optimal shuffle depends on a myriad of factors, TeShu introduces parameterized shuffle templates, instantiated by accurate and efficient sampling that enables TeShu to dynamically adapt to different application workloads and data center layouts. Our experimental results with real-world graph workloads show that TeShu efficiently enables shuffling optimizations that improve performance and adapt to a variety of scenarios.
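To make the idea of a parameterized shuffle template instantiated by sampling more concrete, the following is a minimal sketch, not TeShu's actual API or algorithm. Every name in it (`estimate_reduction_ratio`, `instantiate_template`, `shuffle`, `combine_before_send`, and the `threshold` parameter) is hypothetical. It assumes a single template parameter: whether to combine records that share a key before sending them over the network. That parameter is set from a sample of the workload.

```python
import random
from collections import Counter


def estimate_reduction_ratio(records, sample_rate, seed=0):
    """Estimate distinct_keys / records from a random sample (hypothetical helper).

    A low ratio suggests many duplicate keys, so combining before sending
    would shrink the shuffled data. This is a crude estimator: a sparse
    sample tends to see fewer duplicates than the full data set contains.
    """
    rng = random.Random(seed)
    sample = [r for r in records if rng.random() < sample_rate]
    if not sample:
        return 1.0  # no evidence of duplicates; assume combining does not help
    distinct = len({key for key, _ in sample})
    return distinct / len(sample)


def instantiate_template(records, sample_rate=0.1, threshold=0.5):
    """Fill in the template's single parameter from a sample (assumed policy)."""
    ratio = estimate_reduction_ratio(records, sample_rate)
    return {"combine_before_send": ratio < threshold, "estimated_ratio": ratio}


def shuffle(records, num_partitions, plan):
    """Partition (key, value) records by key, optionally combining first."""
    if plan["combine_before_send"]:
        combined = Counter()
        for key, value in records:
            combined[key] += value  # sum values that share a key
        records = list(combined.items())
    partitions = [[] for _ in range(num_partitions)]
    for key, value in records:
        partitions[hash(key) % num_partitions].append((key, value))
    return partitions


if __name__ == "__main__":
    # 1000 messages spread over only 10 destination vertices: highly combinable.
    workload = [(i % 10, 1) for i in range(1000)]
    plan = instantiate_template(workload)
    parts = shuffle(workload, num_partitions=4, plan=plan)
    print(plan["combine_before_send"], sum(len(p) for p in parts))
```

In this sketch the same `shuffle` code path serves both skewed and uniform workloads. Only the sampled parameter changes, which is the kind of adaptation the abstract describes at a much larger scale.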