User Tampering in Reinforcement Learning Recommender Systems - Zhuanzhi Papers

Topics: recommender systems · learned · reinforcement learning · independent · design

9 September 2021

User Tampering in Reinforcement Learning Recommender Systems

Charles Evans, Atoosa Kasirzadeh

From arXiv. Accepted for presentation at the 4th FAccTRec Workshop on Responsible Recommendation (FAccTRec '21).

This paper provides the first formalisation and empirical demonstration of a particular safety concern in reinforcement learning (RL)-based news and social media recommendation algorithms. This safety concern is what we call "user tampering" -- a phenomenon whereby an RL-based recommender system may manipulate a media user's opinions, preferences and beliefs via its recommendations as part of a policy to increase long-term user engagement. We provide a simulation study of a media recommendation problem constrained to the recommendation of political content, and demonstrate that a Q-learning algorithm consistently learns to exploit its opportunities to 'polarise' simulated 'users' with its early recommendations in order to have more consistent success with later recommendations catering to that polarisation. Finally, we argue that given our findings, designing an RL-based recommender system which cannot learn to exploit user tampering requires making the metric for the recommender's success independent of observable signals of user engagement, and thus that a media recommendation system built solely with RL is necessarily either unsafe, or almost certainly commercially unviable.
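
The dynamic described in the abstract can be illustrated with a toy simulation. The sketch below is not the authors' simulation study: the five-point opinion scale, the click probabilities, the neutral-content option, the uniform exploration and the decaying step size are all assumptions chosen to keep the example small. It trains tabular Q-learning twice, once myopically (`gamma=0.0`) and once with a long horizon (`gamma=0.9`), and compares the greedy policies.

```python
import random

OPINIONS = (-2, -1, 0, 1, 2)  # assumed discretised political leaning of a simulated user
ACTIONS = (-1, 0, 1)          # recommend left-leaning, neutral, or right-leaning content


def click_probability(opinion, action):
    """Assumed engagement model: neutral content is a safe 0.6,
    partisan content does better the more it matches the user's leaning."""
    if action == 0:
        return 0.6
    return 0.5 + 0.2 * action * opinion


def step(opinion, action, rng):
    """Simulate one recommendation. A clicked partisan item nudges the
    user's opinion one step towards that item (clamped to the scale)."""
    clicked = rng.random() < click_probability(opinion, action)
    if clicked and action != 0:
        opinion = max(-2, min(2, opinion + action))
    return (1.0 if clicked else 0.0), opinion


def train(gamma, episodes=30_000, horizon=10, seed=0):
    """Tabular Q-learning with reward 1 per click and discount gamma."""
    rng = random.Random(seed)
    q = {(o, a): 0.0 for o in OPINIONS for a in ACTIONS}
    visits = dict.fromkeys(q, 0)
    for _ in range(episodes):
        opinion = 0  # each episode starts with a fresh, unpolarised user
        for _ in range(horizon):
            # Uniform random exploration; Q-learning is off-policy, so the
            # greedy policy can still be read off the learned table.
            action = rng.choice(ACTIONS)
            reward, nxt = step(opinion, action, rng)
            visits[opinion, action] += 1
            alpha = visits[opinion, action] ** -0.7  # decaying step size
            # The episode end is a truncation, not a terminal state,
            # so every transition bootstraps from the next state.
            target = reward + gamma * max(q[nxt, a] for a in ACTIONS)
            q[opinion, action] += alpha * (target - q[opinion, action])
            opinion = nxt
    return q


def greedy_policy(q):
    """Map each opinion to the action with the highest learned Q-value."""
    return {o: max(ACTIONS, key=lambda a: q[o, a]) for o in OPINIONS}


myopic = greedy_policy(train(gamma=0.0))
far_sighted = greedy_policy(train(gamma=0.9))
print("myopic:     ", myopic)
print("far-sighted:", far_sighted)
```

Under these assumed numbers the two optimal policies differ only for the unpolarised user (opinion 0). The myopic agent prefers neutral content, since a click probability of 0.6 beats 0.5. The long-horizon agent prefers partisan content despite the lower immediate click rate, because shifting the user towards an extreme makes later partisan recommendations succeed more often. That trade of immediate engagement for a changed user is the toy analogue of user tampering.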

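Related content

Recommender systems

Recommender systems identify, from a continuous stream of large-scale information, the items that match a user's habits, preferences or interests. In recommendation tasks, the information being recommended is usually called an item. Depending on the application, items can be news articles, films, music, advertisements, products and so on. On e-commerce sites, recommender systems give customers product information and suggestions, help them decide what to buy, and play the role of a salesperson guiding the customer through a purchase. Personalised recommendation suggests information and products to users based on their interests and purchase behaviour. As e-commerce grows and the number and variety of products rise rapidly, customers need a great deal of time to find what they want to buy. Having to browse large amounts of irrelevant information and products drives away consumers who are overwhelmed by information overload. Personalised recommender systems emerged to address these problems. A personalised recommender system is an advanced business-intelligence platform built on large-scale data mining that gives an e-commerce site's customers fully personalised decision support and information services.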

Related VIP content

[KDD 2020] M2GRL: A Multi-task Multi-view Graph Representation Learning Framework for Web-scale Recommender Systems

Zhuanzhi member service

29+ reads · 30 June 2020

[Sequential recommender systems: challenges, progress and prospects] Sequential Recommender Systems

Zhuanzhi member service

82+ reads · 25 April 2020

[Book] Practical Recommender Systems, 432-page PDF

Zhuanzhi member service

180+ reads · 17 April 2020

[WWW 2020] Addressing the Target Customer Distortion Problem in Recommender Systems

Zhuanzhi member service

10+ reads · 4 April 2020

[AAAI 2020 tutorial] Exploration-Exploitation in Reinforcement Learning

Zhuanzhi member service

101+ reads · 8 February 2020

[Paper recommendation, WWW 2020, UIUC] Correcting for Selection Bias in Learning-to-rank Systems

Zhuanzhi member service

32+ reads · 1 February 2020

Stabilizing Transformers for Reinforcement Learning

Zhuanzhi member service

60+ reads · 17 October 2019

Latest reinforcement learning tutorial, 17-page PDF

Zhuanzhi member service

182+ reads · 11 October 2019

[Anti-AI] AI in 2019: A Year in Review

Zhuanzhi member service

79+ reads · 10 October 2019

[AAMSA 2019 | tutorial] Epistemic Reasoning In Multiagent Systems, François Schwarzentruber, Rennes, France

Zhuanzhi member service

24+ reads · 14 May 2019

Related news

LibRec Picks: AutoML for Contextual Bandits

LibRec Intelligent Recommendation

7+ reads · 19 September 2019

Hierarchically Structured Meta-learning

CreateAMind

27+ reads · 22 May 2019

LibRec Picks: position-aware long-sequence session recommendation

LibRec Intelligent Recommendation

3+ reads · 17 May 2019

Unsupervised Learning via Meta-Learning

CreateAMind

43+ reads · 3 January 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+ reads · 24 December 2018

LibRec Picks: recommender system papers and source code

LibRec Intelligent Recommendation

14+ reads · 29 November 2018

LibRec Picks: LSTM-based sequential recommendation implementation (PyTorch)

LibRec Intelligent Recommendation

50+ reads · 27 August 2018

[SIGIR 2018] Five papers on adversarial training

Zhuanzhi

12+ reads · 9 July 2018

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+ reads · 25 May 2018

Recommended | Deep reinforcement learning chatbot (paper included)!

Global Artificial Intelligence

4+ reads · 30 January 2018

Related papers

D2RLIR: an improved and diversified ranking function in interactive recommendation systems based on deep reinforcement learning

arXiv

0+ reads · 29 October 2021

A Survey on Reinforcement Learning for Recommender Systems

arXiv

22+ reads · 22 September 2021

A Survey of Deep Reinforcement Learning in Recommender Systems: A Systematic Review and Future Directions

arXiv

14+ reads · 8 September 2021

Learning Recommender Systems from Multi-Behavior Data

arXiv

7+ reads · 29 November 2018

Deep Reinforcement Learning for Page-wise Recommendations

arXiv

8+ reads · 7 May 2018

Human Interaction with Recommendation Systems

arXiv

6+ reads · 28 March 2018

Learning Recommendations While Influencing Interests

arXiv

9+ reads · 23 March 2018

Sequence-Aware Recommender Systems

arXiv

8+ reads · 23 February 2018

Reinforcement Learning based Recommender System using Biclustering Technique

arXiv

5+ reads · 17 January 2018

Deep Reinforcement Learning for List-wise Recommendations

arXiv

13+ reads · 5 January 2018
