Recent advances in artificial intelligence (AI) and its widespread integration into mobile software applications have received significant attention, highlighting the growing prominence of AI capabilities in modern software systems. However, the inherent hallucination and reliability issues of AI continue to raise concerns. Consequently, application users and regulators increasingly ask critical questions such as "Does the application incorporate AI capabilities?" and "What specific types of AI functionality are embedded?" Preliminary efforts have been made to identify AI capabilities in mobile software; however, existing approaches rely mainly on manual inspection and rule-based heuristics. These methods are not only costly and time-consuming but also struggle to adapt to advanced AI techniques. To address these limitations, we propose LLMAID (Large Language Model for AI Discovery). LLMAID comprises four main tasks: (1) candidate extraction, (2) knowledge base interaction, (3) AI capability analysis and detection, and (4) AI service summarization. We apply LLMAID to a dataset of 4,201 Android applications and demonstrate that it identifies 242% more real-world AI apps than state-of-the-art rule-based approaches. Our experiments show that LLMAID achieves high precision and recall, both exceeding 90%, in detecting AI-related components. Additionally, a user study indicates that developers find the AI service summaries generated by LLMAID more informative than, and preferable to, the original app descriptions. Finally, we use LLMAID to perform an empirical analysis of AI capabilities across Android apps. The results reveal a strong concentration of AI functionality in computer vision (54.80%), with object detection emerging as the most common task (25.19%).