Standard library implementations of functions like sin and exp optimize for accuracy, not speed, because they are intended for general-purpose use. But applications already tolerate inaccuracy from cancellation, rounding error, and singularities (sometimes even very high error), and many applications could tolerate error in function implementations as well. This raises an intriguing possibility: speeding up numerical code by tuning standard function implementations. This paper thus introduces OpTuner, an automatic method for selecting the best implementation of each mathematical function at each use site. OpTuner assembles dozens of implementations of the standard mathematical functions from across the speed-accuracy spectrum. OpTuner then uses error Taylor series and integer linear programming to compute optimal assignments of function implementations to use sites, and presents the user with a speed-accuracy Pareto curve they can use to speed up their code. In a case study on the POV-Ray ray tracer, OpTuner speeds up a critical computation, leading to a whole-program speedup of 9% with no change in the program output (whereas human efforts result in slower code and lower-quality output). In a broader study of 37 standard benchmarks, OpTuner matches 216 implementations to 89 use sites and demonstrates speedups of 107% for negligible decreases in accuracy and of up to 438% for error-tolerant applications.
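The selection problem described in the abstract can be illustrated with a toy sketch. This is not OpTuner's algorithm: it replaces integer linear programming with brute-force enumeration, and it assumes a simple first-order error model in which the output error is the sum, over use sites, of a sensitivity times the chosen implementation's error. All implementation names, costs, error bounds, and sensitivities below are hypothetical values invented for illustration.

```python
from itertools import product

# Hypothetical implementations per function: (name, cost, relative error bound).
# None of these numbers come from the paper.
IMPLS = {
    "sin": [("sin_libm", 10.0, 1e-16), ("sin_poly5", 4.0, 1e-7), ("sin_poly3", 2.0, 1e-3)],
    "exp": [("exp_libm", 12.0, 1e-16), ("exp_fast", 3.0, 1e-5)],
}

# Hypothetical use sites: (function, assumed first-order sensitivity of the
# program output to error introduced at that call).
SITES = [("sin", 1.0), ("exp", 2.0), ("sin", 0.5)]


def configurations():
    """Yield (total cost, estimated error, chosen names) for every assignment."""
    choices = [IMPLS[func] for func, _ in SITES]
    for combo in product(*choices):
        cost = sum(c for _, c, _ in combo)
        # First-order model: errors add up, weighted by each site's sensitivity.
        err = sum(s * e for (_, s), (_, _, e) in zip(SITES, combo))
        names = tuple(n for n, _, _ in combo)
        yield cost, err, names


def pareto_frontier(points):
    """Keep assignments that no cheaper-or-equal assignment beats on error."""
    frontier = []
    best_err = float("inf")
    for cost, err, names in sorted(points):
        if err < best_err:
            frontier.append((cost, err, names))
            best_err = err
    return frontier


def cheapest_within(budget):
    """Return the cheapest assignment whose estimated error fits the budget."""
    feasible = [p for p in configurations() if p[1] <= budget]
    return min(feasible) if feasible else None


if __name__ == "__main__":
    for cost, err, names in pareto_frontier(configurations()):
        print(f"cost={cost:5.1f}  err={err:.3e}  {names}")
```

Enumeration is exponential in the number of use sites, which is why a solver-based formulation such as the integer linear program mentioned in the abstract is the practical choice for real programs; the sketch only shows the shape of the trade-off curve a user would choose from.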