The Role of Cross-Silo Federated Learning in Facilitating Data Sharing in the Agri-Food Sector

Data sharing remains a major obstacle to adopting emerging AI technologies in general, and in the agri-food sector in particular. Protectiveness of data is natural in this setting: data is a precious commodity for its owners, and if used properly it can give them useful insights into operations and processes, leading to a competitive advantage. Unfortunately, novel AI technologies often require large amounts of training data in order to perform well, which is unrealistic in many scenarios. However, recent machine learning advances, e.g. federated learning and privacy-preserving technologies, can offer a solution by providing the infrastructure and underpinning technologies needed to train models on data from various sources without ever sharing the raw data themselves. In this paper, we propose a technical solution based on federated learning that uses decentralized data (i.e. data that are not exchanged or shared but remain with their owners) to develop a cross-silo machine learning model that facilitates data sharing across supply chains. We focus our data sharing proposition on improving production optimization through soybean yield prediction, and describe potential use cases in which such methods can assist in other problem settings. Our results demonstrate not only that our approach performs better than each of the models trained on an individual data source, but also that data sharing in the agri-food sector can be enabled via alternatives to data exchange, while also helping the sector adopt emerging machine learning technologies to boost productivity.
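To make the idea of training on decentralized data more concrete, the following is a minimal sketch of federated averaging on synthetic data. It is not the paper's soybean yield model: the linear model, the three synthetic silos, and every function name (`local_update`, `federated_averaging`, `make_silos`) are assumptions chosen for illustration. The point it shows is that each silo trains on its own records and only model weights are sent to the coordinator.

```python
import random


def predict(w, x):
    """Linear model: dot product of weights and features."""
    return sum(wi * xi for wi, xi in zip(w, x))


def local_update(w, data, lr=0.1, epochs=5):
    """Run a few gradient-descent passes (mean squared error) on one silo's private data."""
    w = list(w)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in data:
            err = predict(w, x) - y
            for j, xj in enumerate(x):
                grad[j] += 2.0 * err * xj / len(data)
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


def federated_averaging(silos, n_features, rounds=100):
    """Coordinator loop: only weight vectors leave a silo, never the raw records."""
    global_w = [0.0] * n_features
    total = sum(len(d) for d in silos)
    for _ in range(rounds):
        # Each silo starts from the current global weights and trains locally.
        local_ws = [local_update(global_w, d) for d in silos]
        # The new global model is the average of local models, weighted by silo size.
        global_w = [
            sum(len(d) / total * w[j] for d, w in zip(silos, local_ws))
            for j in range(n_features)
        ]
    return global_w


def make_silos(seed=0, sizes=(40, 60, 80), true_w=(1.5, -2.0)):
    """Hypothetical synthetic data: three owners sampling the same underlying relation."""
    rng = random.Random(seed)
    silos = []
    for n in sizes:
        data = []
        for _ in range(n):
            x = [rng.uniform(-1, 1) for _ in true_w]
            y = predict(true_w, x) + rng.gauss(0, 0.01)
            data.append((x, y))
        silos.append(data)
    return silos


if __name__ == "__main__":
    silos = make_silos()
    print(federated_averaging(silos, n_features=2))
```

In a real cross-silo deployment the local updates would run on each owner's infrastructure and the weights would travel over a network, possibly with additional privacy-preserving measures; this sketch keeps everything in one process only to show the data flow.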


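Federated learning is an emerging foundational AI technique, first proposed by Google in 2016, originally to let Android phone users update models locally on their devices. Its design goal is to enable efficient machine learning across multiple participants or compute nodes while safeguarding information security during large-scale data exchange, protecting device data and personal privacy, and ensuring legal and regulatory compliance. The machine learning algorithms that can be used in federated learning are not limited to neural networks; they also include important algorithms such as random forests. Federated learning is expected to become the foundation of the next generation of collaborative AI algorithms and collaborative networks.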
