When trying to solve a computational problem we are often faced with a choice among algorithms that are all guaranteed to return the right answer but that differ in their runtime distributions (e.g., SAT solvers, sorting algorithms). This paper aims to lay theoretical foundations for such choices by formalizing preferences over runtime distributions. It might seem that we should simply prefer the algorithm that minimizes expected runtime. However, such preferences would be driven by exactly how slow our algorithm is on bad inputs, whereas in practice we are typically willing to cut off occasional, sufficiently long runs before they finish. We propose a principled alternative, taking a utility-theoretic approach to characterize the scoring functions that describe preferences over algorithms. These functions depend on the way our value for solving our problem decreases with time and on the distribution from which captimes are drawn. We describe examples of realistic utility functions and show how to leverage a maximum-entropy approach for modeling underspecified captime distributions. Finally, we show how to efficiently estimate an algorithm's expected utility from runtime samples.
翻译:当我们试图解决计算问题时,我们常常遇到一种在算法中的选择,这些算法都保证返回正确的答案,但在运行时间分布上却各不相同(例如,SAT求解器、分解算法)。本文的目的是通过在运行时间分布中将偏好正规化,为这种选择奠定理论基础。我们似乎应该简单地选择将预期运行时间减少到最低的算法。然而,这种偏好将受到我们的算法对错误投入的精确速度的驱动,而在实践中,我们通常愿意偶尔中断,在它们完成之前足够长的时间运行。我们提出了一个有原则的替代方法,即采用实用理论方法来描述描述描述描述对算法的偏好。这些函数取决于我们用时间来解决问题的方式,以及从中抽取固定时间分布的方法。我们描述了现实的功用功能的例子,并展示如何在设定的上限分配模式下利用最大功率方法进行建模。最后,我们展示如何高效地估计运行时间样本对算法的预期效用。