Android is among the platforms most targeted by attackers. While attackers are improving their techniques, traditional solutions based on static and dynamic analysis have also been evolving. In addition to their code, Android applications carry metadata that can be useful for security analysis. Unlike traditional application distribution mechanisms, Android applications are distributed centrally through mobile markets. Therefore, besides application packages, such markets contain app information provided by app developers and app users. The availability of this textual data, together with advances in Natural Language Processing (NLP), which is used to process and understand textual data, has encouraged researchers to investigate the use of NLP techniques in Android security. In particular, work on NLP-based security solutions has accelerated in the last five years, and such solutions have proven useful. This study reviews these proposals and aims to explore possible research directions for future studies by presenting the state of the art in this domain. We mainly focus on NLP-based solutions under four categories: description-to-behaviour fidelity, description generation, privacy, and malware detection.