We propose OmniLytics, a blockchain-based secure data trading marketplace for machine learning applications. Using OmniLytics, many distributed data owners can contribute their private data to collectively train an ML model requested by a model owner, and receive compensation for their contribution. OmniLytics enables such model training while simultaneously providing 1) model security against curious data owners; 2) data security against curious model owners and data owners; 3) resilience to malicious data owners who provide faulty results to poison model training; and 4) resilience to malicious model owners who intend to evade payment. OmniLytics is implemented as a blockchain smart contract to guarantee the atomicity of payment. In OmniLytics, a model owner splits its model into a private part and a public part, and publishes the public part on the contract. Through the execution of the contract, the participating data owners securely aggregate their locally trained models to update the model owner's public model, and receive reimbursement through the contract. We implement a working prototype of OmniLytics on the Ethereum blockchain and perform extensive experiments to measure its gas cost, execution time, and model quality under various parameter combinations. For training a CNN on the MNIST dataset, the model owner is able to boost its model accuracy from 62% to 83% within 500 ms of blockchain processing time. This demonstrates the effectiveness of OmniLytics for practical deployment.
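The secure aggregation step described above can be illustrated with a minimal sketch. The abstract does not specify the aggregation scheme, so the pairwise additive masking used here is an assumed stand-in rather than the OmniLytics protocol itself. The names `MODULUS`, `SCALE`, `pairwise_masks`, `mask_update`, and `aggregate` are hypothetical. A real deployment would derive each pairwise mask from a key agreed between two data owners, not from a single shared seed.

```python
import random

MODULUS = 2**32   # assumed ring size for masked arithmetic
SCALE = 10**4     # assumed fixed-point scale for model weights


def quantize(update):
    """Map float weights to integers modulo MODULUS."""
    return [int(round(x * SCALE)) % MODULUS for x in update]


def dequantize(vec):
    """Map integers modulo MODULUS back to signed floats."""
    out = []
    for v in vec:
        if v >= MODULUS // 2:
            v -= MODULUS
        out.append(v / SCALE)
    return out


def pairwise_masks(num_owners, dim, seed=0):
    """Build one mask per data owner; the masks sum to zero modulo MODULUS.

    Illustration only: a single seeded RNG stands in for pairwise shared secrets.
    """
    rng = random.Random(seed)
    masks = [[0] * dim for _ in range(num_owners)]
    for i in range(num_owners):
        for j in range(i + 1, num_owners):
            shared = [rng.randrange(MODULUS) for _ in range(dim)]
            for k in range(dim):
                masks[i][k] = (masks[i][k] + shared[k]) % MODULUS
                masks[j][k] = (masks[j][k] - shared[k]) % MODULUS
    return masks


def mask_update(update, mask):
    """What a data owner would submit: its quantized update plus its mask."""
    return [(q + m) % MODULUS for q, m in zip(quantize(update), mask)]


def aggregate(masked_updates):
    """Sum the masked updates; the masks cancel, leaving the sum of updates."""
    dim = len(masked_updates[0])
    total = [0] * dim
    for masked in masked_updates:
        for k in range(dim):
            total[k] = (total[k] + masked[k]) % MODULUS
    return dequantize(total)


# Three data owners, each holding a two-weight local update.
updates = [[0.1, -0.2], [0.3, 0.4], [-0.5, 0.6]]
masks = pairwise_masks(len(updates), dim=2)
submitted = [mask_update(u, m) for u, m in zip(updates, masks)]
summed = aggregate(submitted)
```

In this sketch the aggregator sees only the masked vectors in `submitted`, yet `summed` equals the element-wise sum of the individual updates, because every pairwise mask is added by one owner and subtracted by another.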