To protect users' privacy, legislators have regulated the use of tracking technologies, mandating that websites obtain users' consent before collecting data. Consequently, websites increasingly display consent management modules, known as Privacy Banners, that visitors must interact with to access the website content. These banners challenge the automatic collection of Web measurements, which is used primarily to monitor how widespread tracking technologies are, but also to measure Web performance in the wild: Privacy Banners prevent crawlers from observing the actual website content. In this paper, we present a thorough measurement campaign focusing on popular websites in Europe and the US, visiting both landing and internal pages from different countries around the world. We engineer Priv-Accept, a Web crawler able to accept the privacy policies, as most users would do in practice. This lets us compare how webpages change before and after acceptance. Our results show that measurements that do not deal with Privacy Banners offer a very biased and partial view of the Web. After accepting the privacy policies, we observe an increase of up to 70 trackers, which in turn slows down the webpage load time by a factor of 2x-3x.