Type inference methods based on deep learning are becoming increasingly popular because they aim to compensate for the drawbacks of static and dynamic analysis approaches, such as high uncertainty. However, their practical application is still debatable due to several intrinsic issues; for example, code from different software domains involves data types that are unknown to the type inference system. To overcome these problems and obtain high-confidence predictions, we present TIPICAL, a method that combines deep similarity learning with novelty detection. We show that our method predicts data types with high confidence more reliably by successfully filtering out unknown and inaccurately predicted data types, and that it achieves higher F1 scores than the state-of-the-art type inference method Type4Py. Additionally, we investigate how different software domains and data type frequencies may affect the results of our method.
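To make the general idea concrete, the following minimal sketch illustrates how a similarity-based type predictor might be paired with a simple novelty filter. It is not the TIPICAL implementation: the function names `euclidean` and `predict_type`, the nearest-neighbour vote, the parameters `k` and `max_distance`, and the toy two-dimensional embeddings are all assumptions chosen for illustration. In practice the embeddings would come from a learned similarity model.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def predict_type(query, labeled_embeddings, k=3, max_distance=1.0):
    """Predict a type label for `query`, or reject it as novel.

    Hypothetical helper: `labeled_embeddings` is a list of
    (embedding, type_name) pairs for known data types.
    Returns (type_name, vote_share), or (None, 0.0) if rejected.
    """
    # Rank known examples by distance to the query in embedding space.
    neighbors = sorted(
        ((euclidean(query, emb), label) for emb, label in labeled_embeddings),
        key=lambda pair: pair[0],
    )[:k]

    # Novelty filter (assumed rule): if even the closest known example
    # is farther than max_distance, treat the query as an unknown type.
    if not neighbors or neighbors[0][0] > max_distance:
        return None, 0.0

    # Majority vote among the k nearest neighbours.
    votes = Counter(label for _, label in neighbors)
    label, count = votes.most_common(1)[0]
    return label, count / len(neighbors)


# Toy embeddings standing in for the output of a learned model.
known = [((0.0, 0.0), "int"), ((0.1, 0.0), "int"), ((1.0, 1.0), "str")]
near = predict_type((0.05, 0.0), known, k=3, max_distance=0.5)
far = predict_type((5.0, 5.0), known, k=3, max_distance=0.5)
```

Here `near` is accepted with the label `"int"`, while `far` is rejected because it lies far from every known example. Rejecting such queries trades coverage for confidence, which is the kind of filtering the abstract describes.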