Unlike IPv4 addresses, which are typically masked by NAT, IPv6 addresses can easily be correlated with user activity, endangering users' privacy. Mitigations for this privacy concern have been deployed, making existing approaches to address-to-user correlation unreliable. This work demonstrates that an adversary can still correlate IPv6 addresses with users accurately, even with these protection mechanisms in place. To do this, we propose an IPv6 address correlation model, SiamHAN. The model uses a Siamese Heterogeneous Graph Attention Network to measure whether two IPv6 client addresses belong to the same user, even if the user's traffic is protected by TLS encryption. Using a large real-world dataset, we show that, for the tasks of tracking target users and discovering unique users, state-of-the-art techniques achieve only 85% and 60% accuracy, respectively, whereas SiamHAN achieves 99% and 88%.
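The Siamese comparison described in the abstract can be illustrated with a minimal sketch. This is not the SiamHAN implementation: the encoder here is a hypothetical single linear layer with `tanh` standing in for the heterogeneous graph attention network, and the feature vectors, the `weights` matrix, the Euclidean distance, and the `threshold` value are all assumptions made for illustration. The only property it aims to show is that both inputs pass through the same shared encoder and the decision is made on the distance between the two embeddings.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def encode(features: Vector, weights: Sequence[Vector]) -> List[float]:
    """Shared encoder (hypothetical placeholder): one linear layer plus tanh.

    In a Siamese setup the same weights are applied to both inputs.
    """
    return [
        math.tanh(sum(w * x for w, x in zip(row, features)))
        for row in weights
    ]


def siamese_distance(a: Vector, b: Vector, weights: Sequence[Vector]) -> float:
    """Embed both inputs with the shared encoder and return their Euclidean distance."""
    return math.dist(encode(a, weights), encode(b, weights))


def same_user(a: Vector, b: Vector, weights: Sequence[Vector],
              threshold: float = 0.5) -> bool:
    """Decide 'same user' when the embeddings are closer than the (assumed) threshold."""
    return siamese_distance(a, b, weights) < threshold


# Illustrative inputs only: made-up per-address feature vectors.
weights = [[1.0, 0.0], [0.0, 1.0]]
addr_a = [0.2, 0.4]
addr_b = [0.2, 0.4]
addr_c = [2.0, -2.0]
addr_d = [-2.0, 2.0]

print(same_user(addr_a, addr_b, weights))  # identical features, distance 0
print(same_user(addr_c, addr_d, weights))  # far-apart features
```

In a trained model the weights would be learned so that address pairs from the same user land close together and pairs from different users land far apart; the threshold would then be chosen on held-out data.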