Domain name encryption technologies (DoTH and ESNI) have been proposed to improve security and privacy while browsing the web. Although the security benefit is clear, the positive impact on user privacy is still questionable. Given that the mapping between domains and their hosting IPs can be easily obtained, the websites a user visits can still be inferred by a network-level observer based on the destination IPs of user connections. However, content delivery networks, DNS-based load balancing, co-hosting of different websites on the same server, and IP churn all contribute to making domain-IP mappings unstable and prevent straightforward IP-based browsing tracking for the majority of websites. We show that this instability is not a roadblock for browsing tracking (assuming a universal DoTH and ESNI deployment) by introducing an IP-based fingerprinting technique that allows a network-level observer to identify the website a user visits with high accuracy, based solely on the IP address information obtained from the encrypted traffic. Our technique exploits the complex structure of most websites, which load resources from several domains besides their own primary domain. We extract the domains contacted while browsing 220K websites to construct domain-based fingerprints. Each domain-based fingerprint is then converted to an IP-based fingerprint by periodically performing DNS lookups. Using the generated fingerprints, we successfully identify 91% of the websites when observing solely destination IPs. We also evaluate the fingerprints' robustness over time and demonstrate that they are still effective at identifying 70% of the tested websites after two months. We conclude by discussing strategies for website owners and hosting providers to hinder IP-based website fingerprinting and maximize the privacy benefits offered by domain name encryption.
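The conversion from a domain-based fingerprint to an IP-based one, and the matching of observed destination IPs against it, can be illustrated with a minimal Python sketch. This is not the matching procedure of the work itself, which the abstract does not specify. The function names, the scoring rule (the fraction of fingerprint domains with at least one observed IP), and the 0.8 threshold are all assumptions made for illustration.

```python
import socket
from typing import Callable, Dict, FrozenSet, Iterable, List, Optional, Set

# A resolver maps a domain name to the set of IPs it currently resolves to.
Resolver = Callable[[str], Set[str]]


def system_resolve(domain: str) -> Set[str]:
    """Resolve a domain with the system resolver; return an empty set on failure."""
    try:
        infos = socket.getaddrinfo(domain, None)
    except socket.gaierror:
        return set()
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return {info[4][0] for info in infos}


def build_ip_fingerprint(
    domains: Iterable[str], resolve: Resolver = system_resolve
) -> List[FrozenSet[str]]:
    """Turn a domain-based fingerprint into one set of candidate IPs per domain."""
    return [frozenset(resolve(domain)) for domain in domains]


def match_score(fingerprint: List[FrozenSet[str]], observed_ips: Set[str]) -> float:
    """Fraction of fingerprint domains with at least one IP among the observed IPs.

    Hypothetical scoring rule, chosen only for illustration.
    """
    if not fingerprint:
        return 0.0
    hits = sum(1 for ips in fingerprint if ips & observed_ips)
    return hits / len(fingerprint)


def identify(
    observed_ips: Set[str],
    fingerprints: Dict[str, List[FrozenSet[str]]],
    threshold: float = 0.8,  # assumed cut-off, not taken from the source
) -> Optional[str]:
    """Return the best-scoring site if it reaches the threshold, else None."""
    best_site: Optional[str] = None
    best_score = 0.0
    for site, fingerprint in fingerprints.items():
        score = match_score(fingerprint, observed_ips)
        if score > best_score:
            best_site, best_score = site, score
    return best_site if best_score >= threshold else None
```

Because domain-IP mappings change over time, `build_ip_fingerprint` would need to be re-run periodically in such a setup. Passing a custom `resolve` callable (for example, a lookup into previously collected DNS records) keeps the sketch usable without live network access.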