Individual human decision-makers may benefit from different forms of support to improve decision outcomes. A key question, however, is which form of support will lead to accurate decisions at a low cost. In this work, we propose learning a decision support policy that, for a given input, chooses which form of support, if any, to provide. We consider decision-makers for whom we have no prior information and formalize learning their respective policies as a multi-objective optimization problem that trades off accuracy and cost. Using techniques from stochastic contextual bandits, we propose $\texttt{THREAD}$, an online algorithm for personalizing a decision support policy to each decision-maker, and devise a hyperparameter tuning strategy that uses simulated human behavior to identify a suitable cost-performance trade-off. We present computational experiments demonstrating the benefits of $\texttt{THREAD}$ over offline baselines. We then introduce $\texttt{Modiste}$, an interactive tool that provides an interface for $\texttt{THREAD}$. Through human-subject experiments, we show how $\texttt{Modiste}$ learns policies personalized to each decision-maker, and we discuss the nuances of learning decision support policies online for real users.
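To make the abstract's core idea concrete, the following is a minimal, hypothetical sketch (not the paper's actual $\texttt{THREAD}$ implementation) of an online contextual-bandit policy that chooses a form of support per input and optimizes a scalarized accuracy-minus-cost reward. The arm names, costs, context labels, and the simulated decision-maker's accuracies are all illustrative assumptions.

```python
import random

# Assumed set of support forms (arms) and their per-use costs.
SUPPORT_FORMS = ["none", "show_expert_prediction"]
COSTS = {"none": 0.0, "show_expert_prediction": 0.3}
LAMBDA = 1.0  # assumed weight on cost in the scalarized objective


class SupportPolicy:
    """Epsilon-greedy contextual bandit over discrete contexts (a sketch)."""

    def __init__(self, contexts, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        # Running mean reward and pull count per (context, arm) pair.
        self.mean = {(c, a): 0.0 for c in contexts for a in SUPPORT_FORMS}
        self.count = {(c, a): 0 for c in contexts for a in SUPPORT_FORMS}

    def choose(self, context):
        # Explore with probability epsilon, otherwise pick the arm with
        # the highest estimated reward for this context.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(SUPPORT_FORMS)
        return max(SUPPORT_FORMS, key=lambda a: self.mean[(context, a)])

    def update(self, context, arm, correct):
        # Scalarized multi-objective reward: accuracy minus weighted cost.
        reward = float(correct) - LAMBDA * COSTS[arm]
        self.count[(context, arm)] += 1
        n = self.count[(context, arm)]
        self.mean[(context, arm)] += (reward - self.mean[(context, arm)]) / n


def simulate(policy, rounds=2000, seed=1):
    """Simulated decision-maker (assumed behavior): accurate alone on
    'easy' inputs, needs expert support on 'hard' ones."""
    rng = random.Random(seed)
    for _ in range(rounds):
        ctx = rng.choice(["easy", "hard"])
        arm = policy.choose(ctx)
        p_correct = 0.9 if (ctx == "easy" or arm != "none") else 0.4
        policy.update(ctx, arm, rng.random() < p_correct)
    return policy


policy = simulate(SupportPolicy(["easy", "hard"]))
```

After enough rounds, the learned value estimates favor providing no support on easy inputs (support's cost outweighs its benefit) and showing the expert prediction on hard inputs, illustrating how an online policy can personalize support while trading off accuracy against cost.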