Recommender systems have proven to be a successful example of how data availability can ease our everyday digital life. However, data privacy is one of the most prominent concerns of the digital era. After several data breaches and privacy scandals, users are now worried about sharing their data. In the last decade, Federated Learning has emerged as a new privacy-preserving distributed machine learning paradigm. It works by processing data on the user's device without collecting it in a central repository. We present FedeRank (https://split.to/federank), a federated recommendation algorithm. The system learns a personal factorization model on every device. Model training is a synchronous process between the central server and the federated clients. FedeRank computes recommendations in a distributed fashion and lets users control the portion of data they want to share. Extensive experiments comparing FedeRank with state-of-the-art algorithms show its effectiveness in terms of recommendation accuracy, even when users share only a small portion of their data. Further analysis of the diversity and novelty of the recommendation lists supports the suitability of the algorithm for real production environments.
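The training scheme outlined in the abstract (a personal factorization model on each device, synchronous rounds with a central server, and user control over how much data is shared) can be illustrated with a minimal sketch. This is not the FedeRank implementation: the pairwise (BPR-style) loss, the `Client` class, the `federated_round` and `demo` functions, and the `share_fraction` parameter are all assumptions introduced here for illustration. In this sketch, the user vector and the interaction history never leave the client; only item-factor gradients are sent, and the gradient for a consumed item is sent with probability `share_fraction`.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


class Client:
    """Hypothetical federated client: user vector and feedback stay on-device."""

    def __init__(self, positives, n_items, k, share_fraction, rng):
        self.positives = set(positives)  # items the user interacted with
        self.negatives = [i for i in range(n_items) if i not in self.positives]
        self.p = [rng.gauss(0, 0.1) for _ in range(k)]  # private user factors
        self.share_fraction = share_fraction  # assumed user-controlled knob
        self.rng = rng

    def local_step(self, Q, lr):
        """One pairwise update; returns {item_id: gradient} for the server."""
        i = self.rng.choice(sorted(self.positives))  # consumed item
        j = self.rng.choice(self.negatives)          # non-consumed item
        diff = [a - b for a, b in zip(Q[i], Q[j])]
        # Gradient scale of log(sigmoid(x)) with x = p . (q_i - q_j).
        g = 1.0 / (1.0 + math.exp(dot(self.p, diff)))
        p_old = list(self.p)
        # The user vector is updated locally and never transmitted.
        self.p = [p + lr * g * d for p, d in zip(self.p, diff)]
        # The negative item's gradient is always sent (an assumption).
        updates = {j: [-g * p for p in p_old]}
        # The positive item's gradient is sent only with probability share_fraction.
        if self.rng.random() < self.share_fraction:
            updates[i] = [g * p for p in p_old]
        return updates


def federated_round(Q, clients, lr):
    """Synchronous round: every client trains on the same Q, then the server aggregates."""
    all_updates = [c.local_step(Q, lr) for c in clients]
    for updates in all_updates:
        for item, grad in updates.items():
            Q[item] = [q + lr * g for q, g in zip(Q[item], grad)]


def demo(share_fraction, rounds=200, seed=0):
    """Toy run with three clients and six items (made-up data)."""
    rng = random.Random(seed)
    n_items, k = 6, 4
    Q = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_items)]
    clients = [
        Client({0, 1}, n_items, k, share_fraction, rng),
        Client({1, 2}, n_items, k, share_fraction, rng),
        Client({4, 5}, n_items, k, share_fraction, rng),
    ]
    for _ in range(rounds):
        federated_round(Q, clients, lr=0.1)
    return Q, clients
```

Calling `demo(0.1)` runs the toy setup with clients sharing roughly one in ten positive-item gradients. Setting `share_fraction` to 0 means the server only ever sees gradients for items the user did not consume, while 1 shares every sampled positive gradient; how this trade-off affects accuracy is what the experiments mentioned in the abstract evaluate.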