Web3 aims to create a decentralized platform that is competitive with the modern cloud infrastructures supporting today's Internet. However, Web3 is still limited, supporting only applications in the domains of content creation and sharing, decentralized finance, and decentralized communication. This is mainly due to the technologies that underpin Web3: blockchain, IPFS, and libp2p. Although they provide a good collection of tools for developing Web3 applications, they are still limited in terms of design and performance. This motivates the need to better understand these technologies so as to enable novel optimizations that can push Web3 to its full potential. Unfortunately, understanding the current behavior of a fully decentralized, large-scale distributed system is a difficult task, as there is no centralized authority with full knowledge of the system's operation. To this end, in this paper we characterize the workload of IPFS, a key enabler of Web3. To achieve this, we collected traces of the accesses performed by users to one of the most popular IPFS gateways, located in North America, over a period of two weeks. Through a fine-grained analysis of these traces, we measured the number of requests made to the system and identified the providers of the requested content. With this data, we characterize the popularity of both requested and provided content, as well as their geo-location (by matching IP addresses against the MaxMind database). Our results show that most requests in IPFS target only a small number of distinct content items, which are provided by a large portion of the peers in the system. Furthermore, our analysis shows that most requests are served by the two largest groups of providers in the system, located in North America and Europe. With these insights, we conclude that the current IPFS architecture is sub-optimal and propose a research agenda for the future.
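The popularity analysis described in the abstract can be illustrated with a minimal sketch. It is not the authors' tooling: the record format (pairs of client IP and content identifier), the function names `request_counts` and `top_share`, and the sample trace are all assumptions made for illustration. Geo-location would additionally require looking up each IP address in a MaxMind database, which is omitted here.

```python
from collections import Counter


def request_counts(records):
    """Count how many requests were made for each content identifier (CID).

    `records` is an iterable of (client_ip, cid) pairs; this layout is a
    hypothetical simplification of a gateway access log.
    """
    return Counter(cid for _, cid in records)


def top_share(counts, top_fraction):
    """Return the fraction of all requests that target the most popular
    `top_fraction` of distinct CIDs (at least one CID is always included)."""
    if not counts:
        return 0.0
    ranked = sorted(counts.values(), reverse=True)
    k = max(1, int(len(ranked) * top_fraction))
    return sum(ranked[:k]) / sum(ranked)


# Hand-made example trace (not real data): four distinct CIDs, ten requests.
trace = (
    [("198.51.100.1", "cidA")] * 6
    + [("198.51.100.2", "cidB")] * 2
    + [("203.0.113.5", "cidC")]
    + [("203.0.113.9", "cidD")]
)

counts = request_counts(trace)
print(counts.most_common(1))    # the single most requested CID and its count
print(top_share(counts, 0.25))  # share of requests going to the top 25% of CIDs
```

In this toy trace, the most popular quarter of the CIDs (a single item) receives 6 of the 10 requests. A skew of this kind, measured on the real gateway trace, is what the abstract refers to when it says that most requests target only a few content items.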