Predicting the performance of production code before executing or benchmarking it is known to be highly challenging. In this paper, we propose a predictive model, dubbed TEP-GNN, which demonstrates that high-accuracy performance prediction is possible for the special case of predicting unit test execution times. TEP-GNN uses flow-augmented ASTs (FA-ASTs) as a graph-based code representation and predicts test execution times using a graph neural network (GNN) deep learning model. We evaluate TEP-GNN on four real-life Java open source programs, based on 922 test files mined from the projects' public repositories. We find that our approach achieves a high Pearson correlation of 0.789, considerably outperforming a baseline deep learning model. However, we also find that more work is needed for trained models to generalize to unseen projects. Our work demonstrates that FA-ASTs and GNNs are a feasible approach for predicting absolute performance values, and it serves as an important intermediate step towards predicting the performance of arbitrary code prior to execution.
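For readers unfamiliar with the evaluation metric named in the abstract, the following is a minimal sketch of how a Pearson correlation between measured and predicted test execution times could be computed. The function name `pearson` and the sample values are hypothetical illustrations and are not taken from the paper or its dataset. Note that Pearson's r measures linear association between the two series, not absolute prediction error.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalised variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical execution times in seconds (illustrative only, not paper data).
measured = [0.12, 0.80, 1.50, 0.05, 2.30]
predicted = [0.20, 0.70, 1.10, 0.10, 2.00]

r = pearson(measured, predicted)
print(f"Pearson r = {r:.3f}")
```

A value close to 1 indicates that predicted times rise and fall with the measured ones; a model can still score well on this metric while being off by a constant factor, which is why correlation is usually reported alongside an error measure.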