Type4Py: Deep Similarity Learning-Based Type Inference for Python

Source: arXiv. Type4Py's source code and dataset are available at https://github.com/mir-am/type4py-paper. The second version of the paper was published in July 2021.

Dynamic languages, such as Python and JavaScript, trade static typing for developer flexibility and productivity. Lack of static typing can cause run-time exceptions and is a major factor for weak IDE support. To alleviate these issues, PEP 484 introduced optional type annotations for Python. As retrofitting types to existing codebases is error-prone and laborious, learning-based approaches have been proposed to enable automatic type annotations based on existing, partially annotated codebases. However, it is still quite challenging for learning-based approaches to give a relevant prediction in the first suggestion or the first few ones. In this paper, we present Type4Py, a deep similarity learning-based hierarchical neural network model that learns to discriminate between types of the same kind and dissimilar types in a high-dimensional space, which results in clusters of types. Nearest neighbor search suggests a list of likely types for arguments, variables, and function return values. The results of the quantitative and qualitative evaluation indicate that Type4Py significantly outperforms state-of-the-art approaches at the type prediction task. Considering the Top-1 prediction, Type4Py obtains a Mean Reciprocal Rank of 72.5%, which is 10.87% and 16.45% higher than that of Typilus and TypeWriter, respectively.
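To make the nearest-neighbor step and the Mean Reciprocal Rank metric concrete, here is a minimal Python sketch. It is not Type4Py's implementation: the 2-D vectors stand in for the learned high-dimensional embeddings, and the function names (`suggest_types`, `mean_reciprocal_rank`), the Euclidean metric, and the inverse-distance scoring are all assumptions chosen for illustration.

```python
import math
from collections import defaultdict


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def suggest_types(query, labeled_points, k=3, eps=1e-10):
    """Rank candidate types for `query` using its k nearest labeled neighbors.

    `labeled_points` is a list of (vector, type_name) pairs. Each neighbor
    votes for its type with weight 1 / distance (an illustrative choice).
    Returns (type_name, normalized_score) pairs, best first.
    """
    nearest = sorted(labeled_points, key=lambda p: euclidean(query, p[0]))[:k]
    scores = defaultdict(float)
    for vec, type_name in nearest:
        scores[type_name] += 1.0 / (euclidean(query, vec) + eps)
    total = sum(scores.values())
    ranked = [(t, s / total) for t, s in scores.items()]
    return sorted(ranked, key=lambda ts: ts[1], reverse=True)


def mean_reciprocal_rank(ranked_predictions, true_types):
    """Average of 1 / rank of the correct type (0 when it is not listed)."""
    total = 0.0
    for ranked, truth in zip(ranked_predictions, true_types):
        for rank, candidate in enumerate(ranked, start=1):
            if candidate == truth:
                total += 1.0 / rank
                break
    return total / len(true_types)


# Toy "type clusters": hypothetical embeddings, not taken from the paper.
TYPE_CLUSTERS = [
    ([0.0, 0.0], "int"),
    ([0.1, 0.0], "int"),
    ([1.0, 1.0], "str"),
    ([1.1, 0.9], "str"),
    ([5.0, 5.0], "List[int]"),
]

suggestions = suggest_types([0.05, 0.1], TYPE_CLUSTERS, k=3)
top1_type = suggestions[0][0]

# Three queries whose true type is "int": ranked 1st, ranked 2nd, missing.
mrr = mean_reciprocal_rank(
    [["int", "str"], ["str", "int"], ["float", "bool"]],
    ["int", "int", "int"],
)
```

In this toy setup the query lies inside the `int` cluster, so `int` is the first suggestion. The three example rankings contribute reciprocal ranks of 1, 1/2, and 0, giving an MRR of 0.5. This is why MRR rewards models that place the correct type at or near the top of the suggestion list.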
