This paper contributes to a better understanding of the energy consumption trade-offs of Artificial Intelligence (AI), and more specifically Deep Learning (DL), algorithms at HPC scale. For this task we developed Benchmark-Tracker, a benchmark tool to evaluate the speed and energy consumption of DL algorithms in HPC environments. We exploited hardware counters and Python libraries to collect energy information in software, which enabled us to instrument a well-known AI benchmark tool and to evaluate the energy consumption of numerous DL algorithms and models. Through an experimental campaign, we present a case example of the potential of Benchmark-Tracker to measure computing speed and energy consumption during training and inference of DL algorithms, and of its potential to help better understand the energy behavior of DL algorithms on HPC platforms. This work is a step forward in understanding the energy consumption of Deep Learning in HPC, and it also contributes a new tool that helps HPC DL developers better balance HPC infrastructure in terms of speed and energy consumption.
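To illustrate the kind of software-based energy collection mentioned above, the minimal sketch below uses the pyRAPL library, which exposes Intel RAPL hardware counters from Python, to measure the energy consumed by a single training step. The library choice and the `train_one_epoch` function are assumptions for illustration only, not necessarily the exact instrumentation used by Benchmark-Tracker.

```python
# Minimal sketch: measuring the energy of a DL training step through RAPL
# hardware counters exposed to Python. Assumes the pyRAPL package and an
# Intel CPU with RAPL support; `train_one_epoch` is a hypothetical
# placeholder for the workload, not part of Benchmark-Tracker.
import pyRAPL

pyRAPL.setup()  # initialize access to the RAPL energy counters


def train_one_epoch(model, data):
    # Hypothetical placeholder for one epoch of DL training.
    ...


meter = pyRAPL.Measurement('dl_training_epoch')
meter.begin()
train_one_epoch(model=None, data=None)  # run the workload to be measured
meter.end()

result = meter.result
# pkg and dram energies are reported per socket in microjoules,
# and the duration in microseconds.
print(f"CPU package energy (uJ): {result.pkg}")
print(f"DRAM energy (uJ):        {result.dram}")
print(f"Duration (us):           {result.duration}")
```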