Companies with an online presence, particularly those that are exclusively digital, often follow the same business model: collect data from the user base, then expose that data to advertising agencies to turn a profit. Such companies routinely market a service as "free" while obscuring the fact that they "charge" users in the currency of personal information rather than money. Online companies also gather user data for more principled purposes, such as improving the user experience and aggregating statistics; the problem is the sale of user data to third parties. In this work, we design an intelligent approach to online privacy protection that leverages supervised learning. By detecting and blocking data collection that might infringe on a user's privacy, we can restore a degree of digital privacy to the user. In our evaluation, we collect a dataset of network requests and measure the performance of several supervised classifiers. The results demonstrate the feasibility and potential of our approach.