Data summarization is a valuable tool for deriving knowledge from large data streams and has proven useful in a great number of applications. Summaries can be found by optimizing submodular functions. These functions map subsets of data to real values that indicate their "representativeness"; maximizing them yields a diverse summary of the underlying data. In this paper, we study exemplar-based clustering as a submodular function and provide a GPU algorithm to cope with its high computational complexity. We show that our GPU implementation provides speedups of up to 72x using single-precision and up to 452x using half-precision computation compared to conventional CPU algorithms. We also show that the GPU algorithm provides remarkable runtime benefits not only on workstation-grade GPUs but also on low-power embedded computation units, for which speedups of up to 35x are possible. Furthermore, we apply our algorithm to real-world data from injection molding manufacturing processes and discuss how the summaries found help steer this specific process to cut costs and reduce the manufacturing of bad parts. Beyond pure speedup considerations, we show that our approach can provide summaries within reasonable time frames for this kind of industrial, real-world data.
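To make the objective concrete, the following is a minimal CPU sketch of exemplar-based clustering as a submodular function, maximized with the standard greedy strategy. It is not the GPU algorithm described above. It assumes the formulation common in the submodular optimization literature, f(S) = L({e0}) - L(S ∪ {e0}), where L is the mean distance of each data point to its nearest exemplar. The squared Euclidean distance, the all-zero auxiliary vector e0, and all function names are illustrative assumptions and do not come from the abstract.

```python
def sq_dist(a, b):
    """Squared Euclidean distance (an assumed dissimilarity measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmedoids_loss(data, exemplars):
    """L(S): mean distance from every data point to its nearest exemplar."""
    return sum(min(sq_dist(v, s) for s in exemplars) for v in data) / len(data)


def exemplar_value(data, summary, aux):
    """f(S) = L({e0}) - L(S u {e0}); aux is the auxiliary vector e0 (assumed)."""
    return kmedoids_loss(data, [aux]) - kmedoids_loss(data, list(summary) + [aux])


def greedy_summary(data, k, aux=None):
    """Greedily pick up to k exemplars, each maximizing f on the grown summary."""
    if aux is None:
        # Assumption: use the all-zero vector as the auxiliary element e0.
        aux = tuple(0.0 for _ in data[0])
    summary = []
    for _ in range(k):
        candidates = [v for v in data if v not in summary]
        if not candidates:
            break
        # Every candidate evaluation scans the whole data set.
        best = max(candidates,
                   key=lambda v: exemplar_value(data, summary + [v], aux))
        summary.append(best)
    return summary


if __name__ == "__main__":
    # Two small, well-separated clusters of hypothetical 2-D points.
    points = [(0.0, 1.0), (0.1, 1.0), (5.0, 5.0), (5.1, 5.0)]
    print(greedy_summary(points, k=2))
```

Each greedy step evaluates the function once per candidate, and each evaluation scans every data point. This nested loop is presumably the kind of cost that motivates a parallel implementation, although the sketch makes no claim about how the paper's GPU kernels are organized.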