We present a new approach to the type inference problem for dynamic languages. Our goal is to combine \emph{logical} constraints, that is, deterministic information from a type system, with \emph{natural} constraints, that is, uncertain statistical information about types learnt from sources such as identifier names. To this end, we introduce a framework for probabilistic type inference that combines logic and learning: logical constraints on the types are extracted from the program, and deep learning is applied to predict types from surface-level code properties that are statistically associated with them. The key insight of our method is to constrain the predictions from the learning procedure to respect the logical constraints, which we achieve by relaxing the logical inference problem of type prediction into a continuous optimisation problem. We build a tool called OptTyper to predict missing types for TypeScript files. OptTyper combines a continuous interpretation of logical constraints, derived by classical static analysis of TypeScript code, with natural constraints obtained from a deep learning model that learns naming conventions for types from a large codebase. By evaluating OptTyper, we show that combining logical and natural constraints yields a large improvement in performance over either kind of information on its own and achieves a 4% improvement over the state of the art.
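As a rough illustration of relaxing a logical type constraint into a continuous optimisation problem, the following toy sketch may help. It is not OptTyper's actual formulation: the two variables, the candidate types, the "natural" probabilities, the equality constraint, the loss and the finite-difference gradient descent are all assumptions chosen to keep the example small and dependency-free.

```python
import math

TYPES = ["number", "string"]

# Hypothetical "natural" constraints: per-variable type probabilities, standing
# in for the output of a learned model over identifier names (assumed values).
NATURAL = {"x": [0.55, 0.45], "y": [0.10, 0.90]}
VARS = sorted(NATURAL)


def softmax(logits):
    """Map unconstrained logits to a probability vector."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def loss(logits, weight=1.0):
    """Stay close to the natural predictions while satisfying the relaxed
    logical constraint type(x) == type(y) (an assumed example constraint)."""
    probs = {v: softmax(logits[v]) for v in VARS}
    natural_term = sum(
        (p - q) ** 2 for v in VARS for p, q in zip(probs[v], NATURAL[v])
    )
    # Continuous relaxation of equality: probability that x and y agree.
    agree = sum(px * py for px, py in zip(probs["x"], probs["y"]))
    return natural_term - weight * math.log(agree)


def optimise(steps=2000, lr=0.1, eps=1e-6):
    """Plain gradient descent with central finite-difference gradients."""
    logits = {v: [math.log(q) for q in NATURAL[v]] for v in VARS}
    for _ in range(steps):
        grads = {v: [0.0] * len(TYPES) for v in VARS}
        for v in VARS:
            for i in range(len(TYPES)):
                logits[v][i] += eps
                up = loss(logits)
                logits[v][i] -= 2 * eps
                down = loss(logits)
                logits[v][i] += eps  # restore the original value
                grads[v][i] = (up - down) / (2 * eps)
        for v in VARS:
            for i in range(len(TYPES)):
                logits[v][i] -= lr * grads[v][i]
    return {v: softmax(logits[v]) for v in VARS}


def predict(probs):
    """Pick the most probable type for each variable."""
    return {
        v: TYPES[max(range(len(TYPES)), key=lambda i: probs[v][i])] for v in VARS
    }


independent = predict(NATURAL)  # natural constraints only
joint = predict(optimise())     # natural + relaxed logical constraint
print("independent:", independent)
print("joint:", joint)
```

In this toy setting, taking the most probable type for each variable independently assigns different types to `x` and `y`, which violates the assumed equality constraint. The jointly optimised prediction instead moves `x` to agree with the more confident prediction for `y`.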