Recent years have seen the adoption of Machine Learning (ML) techniques to predict the performance of large-scale applications, mostly at a coarse level. In contrast, we propose to use ML techniques for performance prediction at a much finer granularity, namely at the level of Basic Blocks (BBs): the single-entry, single-exit code blocks that all compilers use as an analysis unit to break a large code into manageable pieces. Utilizing ML and BB analysis together can enable scalable hardware-software co-design beyond the current state of the art. In this work, we extrapolate the basic block execution counts of GPU applications for large input sizes from the counts observed at smaller input sizes of the same application. We employ two ML models, a Poisson Neural Network (PNN) and a Bayesian Regularization Backpropagation Neural Network (BR-BPNN). We train both models using the smallest input values of the application, as well as random input values, to predict basic block counts. Results show that our models accurately predict the basic block execution counts of 16 benchmark applications. The PNN and BR-BPNN models achieve an average accuracy of 93.5% and 95.6%, respectively, when extrapolating the basic block counts for large input sets after training on smaller input sets. Additionally, the models show an average accuracy of 97.7% and 98.1%, respectively, when predicting basic block counts on random instances.
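The extrapolation idea can be illustrated with a minimal sketch: measure a block's execution count at a few small input sizes, fit a log-linear (power-law) model, and query it at a larger, unseen size. This is a deliberately simplified stand-in for the paper's PNN and BR-BPNN models, and the counts below are invented for illustration, not taken from the benchmarks.

```python
import math

# Hypothetical basic-block execution counts measured at small input sizes
# (illustrative numbers only; execution counts often grow polynomially
# with input size, here roughly quadratically).
sizes = [64, 128, 256, 512]
counts = [4_100, 16_500, 65_900, 263_000]

# Fit log(count) = a + b * log(size) by ordinary least squares --
# a minimal stand-in for the neural-network extrapolators in the paper.
xs = [math.log(s) for s in sizes]
ys = [math.log(c) for c in counts]
n = len(xs)
mx = sum(xs) / n
my = sum(ys) / n
b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
a = my - b * mx

def predict(size):
    """Extrapolate the basic-block execution count for an unseen input size."""
    return math.exp(a + b * math.log(size))

print(b)              # fitted growth exponent (close to 2 for quadratic scaling)
print(predict(2048))  # extrapolated count at a larger input size
```

The same fit-on-small, predict-on-large workflow underlies the evaluation described above, with the neural networks replacing the hand-picked power-law form.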