Decentralized data storage systems like the InterPlanetary File System (IPFS) are becoming increasingly popular, e.g., as a data layer in blockchain applications and for sharing content in a censorship-resistant manner. In IPFS, data is hosted by an open set of peers; data requests are broadcast to all directly connected peers and routed via a distributed hash table (DHT). In this paper, we show how monitoring these data requests yields profound insights into the IPFS network while simultaneously breaching individual users' privacy. To this end, we present a passive monitoring methodology that enables us to collect the data requests of a significant portion of the total IPFS node population, a portion that can be scaled up further. Using a measurement setup that implements our approach and data collected over a period of fifteen months, we demonstrate how to estimate, among other things, the size of the IPFS network, its activity levels and structure, and content popularity distributions. We furthermore show how our methodology can be abused for attacks on users' privacy. As a demonstration, we identify and successfully surveil public IPFS/HTTP gateways, thereby also uncovering their (normally hidden) node identifiers. We find that public gateways issue a large number of requests, which suggests that these gateways are heavily used. We give a detailed analysis of the mechanics and reasons behind the implied privacy threats and discuss possible countermeasures.
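To illustrate how a network size estimate can be derived from passive observations, the following is a minimal sketch that assumes two independent monitoring nodes, each of which records the set of peer identifiers it has seen. It applies a standard capture-recapture (Lincoln-Petersen) estimate to the overlap between the two sets. This estimator, the function name `estimate_population`, and the sample peer identifiers are assumptions made for illustration; the abstract does not state which estimator the paper actually uses.

```python
def estimate_population(peers_a: set[str], peers_b: set[str]) -> float:
    """Lincoln-Petersen estimate of total population size.

    Assumes (hypothetically) that both monitors sample peers
    independently and uniformly from the same population.
    """
    overlap = len(peers_a & peers_b)
    if overlap == 0:
        # Without any shared peers the estimate is undefined.
        raise ValueError("no overlap between monitors; estimate undefined")
    return len(peers_a) * len(peers_b) / overlap


# Hypothetical peer identifiers seen by two monitors.
monitor_a = {"peer1", "peer2", "peer3", "peer4"}
monitor_b = {"peer3", "peer4", "peer5", "peer6", "peer7", "peer8"}

# 4 peers * 6 peers / 2 shared peers
print(estimate_population(monitor_a, monitor_b))
```

In practice, peers are unlikely to be sampled uniformly, since long-lived and well-connected nodes are more likely to be seen by both monitors, so an estimate of this kind should be read as a rough indication only.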