Recent research has shown that dynamic zeros in the shader programs of gaming applications can be effectively exploited with a profile-guided code-versioning transform. This transform duplicates code, specializes one path under the assumption that certain key program operands, called versioning variables, are zero, and leaves the other path unspecialized. At run time, depending on the value of the versioning variable, either the specialized fast path or the default slow path executes. Prior work applied this transform manually and showed promising gains on gaming applications. In this paper, we present AZP, an automatic compiler approach that performs this code-versioning transform. Our framework automatically determines which versioning variables, or combinations of them, are profitable, and determines the code region to duplicate and specialize (called the versioning scope). AZP takes operand zero-value probabilities as input and then uses classical techniques such as constant folding and dead-code elimination to determine the most profitable versioning variables and their versioning scopes. This information is then used to effect the final transform in a straightforward manner. We demonstrate that AZP achieves an average speedup of 16.4% for targeted shader programs, amounting to an average frame-rate speedup of 3.5% across a collection of modern gaming applications on an NVIDIA GeForce RTX 2080 GPU.
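As an illustration of the transform described above, the following is a minimal sketch in Python rather than shader code. It is not AZP's implementation: the function names (`shade_default`, `shade_versioned`, `expected_gain`), the toy shading formula, and the cost model are all hypothetical and chosen only to show the idea. The versioning variable here is `spec_weight`; when it is zero, constant folding reduces the specular product to zero and dead-code elimination removes the exponentiation, leaving a shorter fast path.

```python
"""Toy illustration of zero-value code versioning (hypothetical, not AZP itself)."""


def shade_default(albedo: float, light: float, spec_weight: float, n_dot_h: float) -> float:
    """Unspecialized slow path: always evaluates the specular term."""
    diffuse = albedo * light
    specular = spec_weight * (n_dot_h ** 32)
    return diffuse + specular


def shade_versioned(albedo: float, light: float, spec_weight: float, n_dot_h: float) -> float:
    """Versioned code: a run-time check on the versioning variable picks a path."""
    if spec_weight == 0.0:
        # Fast path, specialized for spec_weight == 0: the product folds to 0
        # and the power computation becomes dead code. Folding x * 0 to 0
        # assumes finite inputs (0 * inf would be NaN under strict IEEE rules).
        return albedo * light
    # Slow path: the original, unspecialized code.
    return shade_default(albedo, light, spec_weight, n_dot_h)


def expected_gain(p_zero: float, slow_cost: float, fast_cost: float, check_cost: float) -> float:
    """Assumed toy cost model: expected cost saved per invocation.

    p_zero is the profiled probability that the versioning variable is zero;
    the costs are in arbitrary units. A positive result suggests that
    versioning on this variable is profitable under this model.
    """
    return p_zero * (slow_cost - fast_cost) - check_cost


if __name__ == "__main__":
    print(shade_versioned(0.5, 2.0, 0.0, 0.9))   # takes the fast path
    print(shade_versioned(0.5, 2.0, 0.25, 1.0))  # takes the slow path
    print(expected_gain(0.5, 10.0, 4.0, 1.0))
```

Under this assumed model, a versioning variable is worth specializing only when the probability of it being zero, multiplied by the cost removed from the fast path, exceeds the cost of the added run-time check. This is the kind of trade-off that zero-value probabilities allow a compiler to evaluate.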