OmniLytics: A Blockchain-based Secure Data Market for Decentralized Machine Learning

From arXiv. An initial version of this article was published at the International Workshop on Federated Learning for User Privacy and Data Confidentiality, held in conjunction with ICML 2021 (http://federated-learning.org/fl-icml-2021/). This version has been submitted to AAAI'22.

We propose OmniLytics, a blockchain-based secure data trading marketplace for machine learning applications. Using OmniLytics, many distributed data owners can contribute their private data to collectively train an ML model requested by a model owner, and receive compensation for their contribution. OmniLytics enables such model training while simultaneously providing 1) model security against curious data owners; 2) data security against curious model owners and data owners; 3) resilience to malicious data owners who provide faulty results to poison model training; and 4) resilience to malicious model owners who intend to evade payment. OmniLytics is implemented as a blockchain smart contract to guarantee the atomicity of payment. In OmniLytics, a model owner splits its model into a private part and a public part and publishes the public part on the contract. Through the execution of the contract, the participating data owners securely aggregate their locally trained models to update the model owner's public model and receive reimbursement through the contract. We implement a working prototype of OmniLytics on the Ethereum blockchain and perform extensive experiments to measure its gas cost, execution time, and model quality under various parameter combinations. For training a CNN on the MNIST dataset, the model owner (MO) is able to boost its model accuracy from 62% to 83% within 500 ms of blockchain processing time. This demonstrates the effectiveness of OmniLytics for practical deployment.
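The abstract states that data owners "securely aggregate" their locally trained models but does not say how. As an illustration only, the sketch below assumes pairwise additive masking, a common secure-aggregation technique that may differ from what OmniLytics actually uses. Every name here (`MODULUS`, `pairwise_masks`, `mask_update`, `aggregate`) is hypothetical, updates are assumed to be already encoded as non-negative integers, and a single seeded generator stands in for the pairwise key agreement a real deployment would need.

```python
import random

# Hypothetical working modulus; model updates are assumed to be
# fixed-point encoded as integers smaller than this value.
MODULUS = 2**32


def pairwise_masks(num_owners, dim, seed=0):
    """Build one mask vector per data owner so that all masks sum to zero.

    For each pair (i, j) a shared random vector is added to owner i's mask
    and subtracted from owner j's mask. A single seeded RNG simulates the
    pairwise shared secrets (an assumption made for brevity).
    """
    rng = random.Random(seed)
    masks = [[0] * dim for _ in range(num_owners)]
    for i in range(num_owners):
        for j in range(i + 1, num_owners):
            shared = [rng.randrange(MODULUS) for _ in range(dim)]
            for k in range(dim):
                masks[i][k] = (masks[i][k] + shared[k]) % MODULUS
                masks[j][k] = (masks[j][k] - shared[k]) % MODULUS
    return masks


def mask_update(update, mask):
    """Hide a local update by adding the owner's mask modulo MODULUS."""
    return [(u + m) % MODULUS for u, m in zip(update, mask)]


def aggregate(masked_updates):
    """Sum masked updates; the pairwise masks cancel in the total."""
    dim = len(masked_updates[0])
    total = [0] * dim
    for vec in masked_updates:
        for k in range(dim):
            total[k] = (total[k] + vec[k]) % MODULUS
    return total


# Three data owners, each holding a toy 3-dimensional local update.
local_updates = [[1, 2, 3], [10, 20, 30], [100, 200, 300]]
masks = pairwise_masks(len(local_updates), dim=3)
masked = [mask_update(u, m) for u, m in zip(local_updates, masks)]
aggregated = aggregate(masked)
print(aggregated)
```

In this toy run the aggregator sees only the masked vectors, yet their sum equals the element-wise sum of the original updates. The sketch covers none of the other properties the abstract lists, such as resilience to poisoned updates or payment atomicity.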

