Addressing the Long-term Impact of ML Decisions via Policy Regret


June 2, 2021


David Lindner, Hoda Heidari, Andreas Krause

From arXiv. Accepted to IJCAI 2021.

Machine Learning (ML) increasingly informs the allocation of opportunities to individuals and communities in areas such as lending, education, employment, and beyond. Such decisions often impact their subjects' future characteristics and capabilities in an a priori unknown fashion. The decision-maker, therefore, faces exploration-exploitation dilemmas akin to those in multi-armed bandits. Following prior work, we model communities as arms. To capture the long-term effects of ML-based allocation decisions, we study a setting in which the reward from each arm evolves every time the decision-maker pulls that arm. We focus on reward functions that are initially increasing in the number of pulls but may become (and remain) decreasing after a certain point. We argue that an acceptable sequential allocation of opportunities must take an arm's potential for growth into account. We capture these considerations through the notion of policy regret, a much stronger notion than the often-studied external regret, and present an algorithm with provably sub-linear policy regret for sufficiently long time horizons. We empirically compare our algorithm with several baselines and find that it consistently outperforms them, in particular for long time horizons.
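The setting in the abstract can be illustrated with a small sketch. It is not the paper's algorithm: it assumes deterministic, noise-free rewards, two hypothetical reward functions (`established` and `growth`) that rise and then fall with the number of pulls, and a myopic greedy baseline chosen only for illustration. Under these assumptions, the reward of an arm's n-th pull depends only on n, so a policy's cumulative reward depends only on how many times it pulls each arm, and the best policy can be found by brute force over pull counts. Policy regret is then the gap between that optimum and the reward the baseline actually collects.

```python
from itertools import product


def established(k):
    """Hypothetical arm: decent rewards early, peaks at the 3rd pull, then declines."""
    return max(0, 7 - 2 * abs(k - 3))


def growth(k):
    """Hypothetical arm: low rewards early, grows until the 8th pull, then declines."""
    return 2 * k if k <= 8 else max(0, 32 - 2 * k)


def cumulative(reward_fn, n_pulls):
    """Total reward from pulling one arm exactly n_pulls times."""
    return sum(reward_fn(k) for k in range(1, n_pulls + 1))


def best_policy_value(arms, horizon):
    """Brute-force the best split of `horizon` pulls across the arms."""
    best = None
    for counts in product(range(horizon + 1), repeat=len(arms)):
        if sum(counts) != horizon:
            continue
        value = sum(cumulative(f, n) for f, n in zip(arms, counts))
        if best is None or value > best:
            best = value
    return best


def greedy_value(arms, horizon):
    """Myopic baseline: always pull the arm whose next reward is highest."""
    counts = [0] * len(arms)
    total = 0
    for _ in range(horizon):
        i = max(range(len(arms)), key=lambda j: arms[j](counts[j] + 1))
        counts[i] += 1
        total += arms[i](counts[i])
    return total


arms = [established, growth]
horizon = 10
optimal = best_policy_value(arms, horizon)
greedy = greedy_value(arms, horizon)
policy_regret = optimal - greedy
print(optimal, greedy, policy_regret)
```

In this toy instance the greedy baseline spends its early pulls on the arm that looks better at first and never gives the other arm enough pulls to reach its peak, which is the kind of neglected growth potential the abstract argues an acceptable allocation must account for.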



