Various data-sharing platforms have emerged in response to growing public demand for open data and legislation mandating that certain data remain open. Most of these platforms remain opaque, raising many questions about data accuracy, provenance and lineage, privacy implications, consent management, and the lack of fair incentives for data providers. With their transparency, immutability, non-repudiation, and decentralization properties, blockchains are well suited to answer these questions and enhance trust in a data-sharing platform. However, blockchains are not good at handling the four Vs of big data (i.e., volume, variety, velocity, and veracity) due to their limited performance, limited scalability, and high cost. Given that many related works propose blockchain-based trustworthy data-sharing solutions, there is increasing confusion and difficulty in understanding and selecting these technologies and platforms in terms of their sharing mechanisms, sharing services, quality of services, and applications. In this paper, we conduct a comprehensive survey of blockchain-based data-sharing architectures and applications to fill this gap. First, we present the foundations of blockchains and discuss the challenges of current data-sharing techniques. Second, we focus on the convergence of blockchain and data sharing to give a clear picture of this landscape and propose a reference architecture for blockchain-based data sharing. Third, we discuss the industrial applications of blockchain-based data sharing, ranging from healthcare and smart grid to transportation and decarbonization. For each application, we provide lessons learned for the deployment of blockchain-based data sharing. Finally, we discuss research challenges and open research directions.