Personalized recommender systems suffuse modern life, shaping what media we read and what products we consume. The algorithms powering such systems tend to consist of supervised learning-based heuristics, such as latent factor models with a variety of heuristically chosen prediction targets. Meanwhile, theoretical treatments of recommendation frequently address the decision-theoretic nature of the problem, including the need to balance exploration and exploitation, via the multi-armed bandits (MABs) framework. However, MAB-based approaches rely heavily on assumptions about human preferences. These preference assumptions are seldom tested using human subject studies, partly due to the lack of publicly available toolkits to conduct such studies. In this work, we conduct a study with crowdworkers in a comics recommendation MAB setting. Each arm represents a comic category, and users provide feedback after each recommendation. We test a core MAB assumption, namely that human preferences (reward distributions) are fixed over time, and find that it does not hold. This finding suggests that any MAB algorithm used for recommender systems should account for human preference dynamics. In answering these questions, we provide a flexible experimental framework for understanding human preference dynamics and testing MAB algorithms with human users. The code for our experimental framework and the collected data can be found at https://github.com/HumainLab/human-bandit-evaluation.
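To make the stationarity assumption concrete, here is a minimal sketch of an epsilon-greedy MAB whose incremental-mean update is only justified when each arm's reward distribution is fixed over time. This is an illustrative example under assumed names, not the paper's implementation:

```python
import random

class EpsilonGreedyBandit:
    """Epsilon-greedy MAB that assumes each arm's reward
    distribution (the user's preference) is fixed over time."""

    def __init__(self, n_arms, epsilon=0.1):
        self.epsilon = epsilon
        self.counts = [0] * n_arms      # pulls per arm
        self.values = [0.0] * n_arms    # running mean reward per arm

    def select_arm(self):
        # With probability epsilon, explore a random arm;
        # otherwise exploit the arm with the highest estimated mean.
        if random.random() < self.epsilon:
            return random.randrange(len(self.counts))
        return max(range(len(self.counts)), key=lambda a: self.values[a])

    def update(self, arm, reward):
        # Incremental mean update: an unbiased estimate only under
        # stationary rewards -- the assumption the study tests.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]
```

If preferences drift, a fixed-weight update (e.g. `values[arm] += alpha * (reward - values[arm])`) that discounts old feedback is one standard adjustment.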