In this paper, we study the CPU utilization time patterns of several Map-Reduce applications. After the running patterns of several applications are extracted, the patterns and their statistical information are saved in a reference database, to be used later to tweak system parameters so that unknown applications can be executed efficiently in the future. To achieve this goal, the CPU utilization patterns of new applications, along with their statistical information, are compared with the known ones in the reference database to find/predict their most probable execution patterns. Because the patterns differ in length, Dynamic Time Warping (DTW) is used for the comparison; a statistical analysis is then applied to the DTW outcomes to select the most suitable candidates. Moreover, under a hypothesis, another algorithm is proposed to classify applications with similar CPU utilization patterns. Three widely used text processing applications (WordCount, Distributed Grep, and Terasort) and another application (Exim Mainlog parsing) are used to evaluate our hypothesis of tweaking system parameters when executing similar applications. The results were very promising and showed the effectiveness of our approach on a 5-node Map-Reduce platform.
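To illustrate the comparison step, the following is a minimal sketch of textbook DTW applied to CPU utilization series of different lengths. It is not the authors' implementation: the function names `dtw_distance` and `closest_reference`, the application labels, and the sample values are all hypothetical, and the plain nearest-match selection stands in for the statistical analysis of DTW outcomes described above.

```python
from math import inf


def dtw_distance(a, b):
    """Textbook O(len(a) * len(b)) DTW using absolute difference as local cost."""
    n, m = len(a), len(b)
    # cost[i][j] is the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def closest_reference(new_pattern, reference_db):
    """Return (name, distance) of the stored pattern with the smallest DTW distance.

    Assumption: a simple nearest-match rule, used here only for illustration.
    """
    return min(
        ((name, dtw_distance(new_pattern, pattern))
         for name, pattern in reference_db.items()),
        key=lambda pair: pair[1],
    )


# Hypothetical CPU utilization samples (percent); not data from the paper.
reference_db = {
    "app_a": [10, 50, 90, 90, 50, 10],
    "app_b": [80, 80, 20, 20, 80, 80],
}
# A time-stretched variant of app_a with a different length.
new_pattern = [10, 50, 50, 90, 90, 90, 50, 10]

name, dist = closest_reference(new_pattern, reference_db)
print(name, dist)
```

Because DTW lets one sample in a series align with several consecutive samples in the other, the stretched series matches `app_a` despite having eight samples instead of six, which is the property that makes it suitable for patterns of unequal length.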