Privacy-Preserving Kickstarting Deep Reinforcement Learning with Privacy-Aware Learners

Kickstarting deep reinforcement learning algorithms establish a teacher-student relationship among agents: a well-performing teacher shares demonstrations with a student to expedite the student's training. Despite the known benefits, these demonstrations may contain sensitive information about the teacher's training data, and existing kickstarting methods take no measures to protect it. We therefore use the framework of differential privacy to develop a mechanism that securely shares the teacher's demonstrations with the student. The mechanism lets the teacher choose the accuracy of its demonstrations relative to the privacy budget it consumes, thereby granting the teacher full control over its data privacy. We then develop a kickstarted deep reinforcement learning algorithm for the student that is privacy-aware: its objective is calibrated with the parameters of the teacher's privacy mechanism. This privacy-aware design makes it possible to kickstart the student's learning despite the perturbations induced by the privacy mechanism. Numerical experiments highlight three empirical results: (i) the algorithm succeeds in expediting the student's learning, (ii) the student converges to a performance level that was not possible without the demonstrations, and (iii) the student maintains its enhanced performance even after the teacher stops sharing useful demonstrations due to its privacy budget constraints.
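
The abstract does not specify the privacy mechanism, so the following is only a minimal sketch of the general idea, assuming a standard Laplace mechanism with basic sequential composition. The class name `PrivateTeacher`, the `share` method, the scalar demonstration value, and the `sensitivity` parameter are all hypothetical and chosen for illustration; they are not taken from the paper.

```python
import random


class PrivateTeacher:
    """Hypothetical teacher that releases noisy values under a total privacy budget.

    Assumption: each released value is a scalar whose sensitivity is known.
    This is a generic Laplace mechanism, not the paper's mechanism.
    """

    def __init__(self, total_epsilon, sensitivity=1.0, seed=None):
        self.remaining_epsilon = total_epsilon
        self.sensitivity = sensitivity
        self._rng = random.Random(seed)

    def _laplace(self, scale):
        # The difference of two i.i.d. exponentials with mean `scale`
        # is distributed as Laplace(0, scale).
        rate = 1.0 / scale
        return self._rng.expovariate(rate) - self._rng.expovariate(rate)

    def share(self, true_value, epsilon):
        """Return a perturbed value, or None if the budget cannot cover `epsilon`."""
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if epsilon > self.remaining_epsilon:
            return None  # budget exhausted: nothing more is shared
        # Basic sequential composition: each release spends `epsilon`.
        self.remaining_epsilon -= epsilon
        # Larger epsilon means a smaller noise scale, i.e. higher accuracy.
        return true_value + self._laplace(self.sensitivity / epsilon)


teacher = PrivateTeacher(total_epsilon=1.0, seed=0)
released = [teacher.share(0.5, epsilon=0.25) for _ in range(5)]
```

In this sketch the teacher controls the trade-off directly: a larger `epsilon` per release yields a more accurate demonstration but drains the budget sooner, and once the budget is spent `share` returns `None`. A privacy-aware student would need to know the noise scale (`sensitivity / epsilon`) in order to weight such demonstrations in its objective; how the paper does this is not stated in the abstract.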
