Contemporary mobile applications (apps) are designed to track, use, and share users' data, often without their consent, which results in potential privacy and transparency issues. To investigate whether mobile apps are transparent about the information they collect about users, and whether they comply with their own privacy policies, we performed a longitudinal analysis of 268 Android applications comprising 5,240 app releases (versions) published between 2008 and 2016. We detect inconsistencies between apps' behaviors and their stated data collection practices to reveal compliance issues. We use machine learning techniques to classify privacy policy text and identify the purported practices that collect and/or share users' personal information, such as phone numbers and email addresses. We then uncover the actual data leaks of an app through static and dynamic analysis. Our results show a steady increase over time in the overall number of apps' data collection practices that are undisclosed in the privacy policies. This is particularly troubling since the privacy policy is the primary tool for describing an app's privacy protection practices. We find that newer versions of the apps are likely to be more non-compliant than their preceding versions. The discrepancies between the purported and actual data practices show that privacy policies are often inconsistent with the apps' behaviors, thus defying the 'notice and choice' principle when users install apps.
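The comparison at the core of this analysis can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes the policy classifier and the static/dynamic analysis have each already produced a set of data-type labels per release. The names `Release`, `undisclosed`, and `undisclosed_counts` and the `location` label are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Release:
    """One app version (hypothetical structure for illustration)."""
    version: str
    disclosed: frozenset  # data types the privacy policy says are collected/shared
    observed: frozenset   # data types seen leaving the app during analysis


def undisclosed(release):
    """Data types the app leaks that its policy does not mention."""
    return release.observed - release.disclosed


def undisclosed_counts(releases):
    """Per-version count of undisclosed practices, in release order."""
    return [(r.version, len(undisclosed(r))) for r in releases]


# Made-up example data, not results from the study.
releases = [
    Release("1.0", frozenset({"email"}), frozenset({"email"})),
    Release("2.0", frozenset({"email"}), frozenset({"email", "phone_number"})),
    Release("3.0", frozenset({"email"}),
            frozenset({"email", "phone_number", "location"})),
]

for version, count in undisclosed_counts(releases):
    print(version, count)
```

A count that grows across successive versions, as in this made-up data, corresponds to the trend described above: newer releases leaking more data types than the policy discloses.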