Dynamically typed languages such as JavaScript and Python have emerged as the most popular programming languages in use. Adding type annotations to dynamically typed programs can bring important benefits. This approach to gradual typing is exemplified by the TypeScript programming system, which allows programmers to specify partially typed programs and then uses static analysis to infer the remaining types. In general, however, the effectiveness of static type inference is limited and depends on the complexity of the program's structure and on the initial type annotations. As a result, there is strong motivation for new approaches that advance the state of the art in statically predicting types in dynamically typed programs, and that do so with acceptable performance for use in interactive programming environments. Previous work has demonstrated the promise of probabilistic type inference using deep learning. In this paper, we advance past work by introducing a range of graph neural network (GNN) models that operate on a novel type flow graph (TFG) representation. The TFG represents an input program's elements as graph nodes connected by syntax edges and data flow edges, and our GNN models are trained to predict the type labels in the TFG for a given input program. We study different design choices for our GNN models on the 100 most common types in our evaluation dataset, and show that our two most accurate GNN configurations achieve top-1 accuracies of 87.76% and 86.89%, respectively, outperforming the two most closely related deep learning type inference approaches from past work: DeepTyper, with a top-1 accuracy of 84.62%, and LambdaNet, with a top-1 accuracy of 79.45%. Further, the average inference throughputs of those two configurations are 353.8 and 1,303.9 files/second, compared to 186.7 files/second for DeepTyper and 1,050.3 files/second for LambdaNet.
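To make the graph representation and the reported metric concrete, the following is a minimal Python sketch under stated assumptions. The abstract says only that the TFG has nodes for program elements, connected by syntax edges and data flow edges, with type labels to be predicted. The names `Node`, `TypeFlowGraph`, `add_node`, and `top1_accuracy`, the edge directions, and the example program are all hypothetical illustrations, not the paper's implementation.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """One program element (hypothetical structure, not the paper's)."""
    name: str
    # Known annotation, or None when the label must be predicted.
    type_label: Optional[str] = None


@dataclass
class TypeFlowGraph:
    """Nodes plus two edge sets, stored as (source, target) index pairs."""
    nodes: List[Node] = field(default_factory=list)
    syntax_edges: List[Tuple[int, int]] = field(default_factory=list)
    data_flow_edges: List[Tuple[int, int]] = field(default_factory=list)

    def add_node(self, name: str, type_label: Optional[str] = None) -> int:
        """Append a node and return its index."""
        self.nodes.append(Node(name, type_label))
        return len(self.nodes) - 1


def top1_accuracy(predicted: List[str], expected: List[str]) -> float:
    """Fraction of positions where the top prediction equals the true type."""
    if not expected or len(predicted) != len(expected):
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(p == e for p, e in zip(predicted, expected))
    return correct / len(expected)


# Assumed example program:  let x: number = 1; let y = x;
graph = TypeFlowGraph()
x = graph.add_node("x", "number")
literal = graph.add_node("1", "number")
y = graph.add_node("y")  # unannotated: its type is what a model would predict

graph.syntax_edges.append((x, literal))     # declaration to its initializer
graph.data_flow_edges.append((literal, x))  # the value 1 flows into x
graph.data_flow_edges.append((x, y))        # the value of x flows into y

unlabeled = [n.name for n in graph.nodes if n.type_label is None]
```

In this sketch, `unlabeled` collects the nodes whose types would be predicted by a model, and `top1_accuracy` compares a model's single best guess per node against the ground-truth labels, which is the sense in which the accuracy figures above are reported.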