Given a basic block of instructions, finding a schedule that requires the minimum number of registers for evaluation is a well-known problem. The problem is NP-complete when the dependences among instructions form a directed acyclic graph (DAG) rather than a tree. We seek efficient approximation algorithms for this problem not only because it is an interesting graph optimization problem in theory, but also because a good solution is an essential component in solving the more complex instruction scheduling problem on GPUs. In this paper, we first explain why this problem is important in GPU instruction scheduling. We then explore two approaches to tackling it. First, we model the problem as a constraint-programming (CP) problem. Using a state-of-the-art CP-SAT solver on a modest desktop PC, we can find optimal answers for much larger cases than previous work. Second, guided by the optimal answers, we design and evaluate heuristics that can be applied to polynomial-time list scheduling algorithms. A combination of those heuristics achieves register pressure that is about 16\% higher than the optimal minimum on average. However, in nearly 3\% of cases the register pressure produced by the heuristic approach is still 50\% higher than the optimal minimum.
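To make the problem statement concrete, the following is a minimal Python sketch, not the paper's CP-SAT model or its heuristics. It assumes a simplified liveness model in which every instruction defines one value occupying one register, a value stays live until its last consumer has issued, and values without consumers stay live to the end of the block. The names `peak_pressure`, `optimal_pressure`, `greedy_schedule`, and the example `dag` are hypothetical. The exhaustive search stands in for an exact solver on tiny inputs only, and the greedy rule (prefer the ready instruction that frees the most operands) is one illustrative list-scheduling priority.

```python
from itertools import permutations


def _user_counts(operands):
    """Count how many times each value is read by later instructions."""
    users_left = {v: 0 for v in operands}
    for ops in operands.values():
        for o in ops:
            users_left[o] += 1
    return users_left


def peak_pressure(order, operands):
    """Peak number of simultaneously live values for a given schedule.

    Assumed model: the result of an instruction needs a register while
    its operands are still live; an operand dies after its last use.
    """
    users_left = _user_counts(operands)
    live, peak = set(), 0
    for v in order:
        live.add(v)
        peak = max(peak, len(live))
        for o in operands[v]:
            users_left[o] -= 1
            if users_left[o] == 0:
                live.discard(o)
    return peak


def is_valid(order, operands):
    """True if every instruction appears after all of its operands."""
    seen = set()
    for v in order:
        if any(o not in seen for o in operands[v]):
            return False
        seen.add(v)
    return True


def optimal_pressure(operands):
    """Exact minimum by brute force; only feasible for very small DAGs."""
    return min(
        peak_pressure(p, operands)
        for p in permutations(operands)
        if is_valid(p, operands)
    )


def greedy_schedule(operands):
    """List scheduling: pick the ready instruction that frees the most operands."""
    users_left = _user_counts(operands)
    scheduled, order = set(), []
    while len(order) < len(operands):
        ready = [
            v for v in operands
            if v not in scheduled and all(o in scheduled for o in operands[v])
        ]

        def freed(v):
            # Operands whose remaining uses are all consumed by v.
            return sum(
                1 for o in set(operands[v])
                if users_left[o] == operands[v].count(o)
            )

        best = max(ready, key=freed)  # ties resolved by dictionary order
        for o in operands[best]:
            users_left[o] -= 1
        scheduled.add(best)
        order.append(best)
    return order


# Hypothetical example: g = (a + b) + (c + d), with a..d as loads.
dag = {
    "a": [], "b": [], "c": [], "d": [],
    "e": ["a", "b"],
    "f": ["c", "d"],
    "g": ["e", "f"],
}

if __name__ == "__main__":
    order = greedy_schedule(dag)
    print(order, peak_pressure(order, dag), optimal_pressure(dag))
```

Issuing all four loads first (`a b c d e f g`) and interleaving them with their consumers give different peaks under this model, which is the gap between a poor schedule and the optimum that the abstract quantifies.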