A Data-Driven State Aggregation Approach for Dynamic Discrete Choice Models - 专知论文


Discrete · Estimation error · Maximum likelihood estimation · Maximum likelihood · Data-driven

April 11, 2023

A Data-Driven State Aggregation Approach for Dynamic Discrete Choice Models


Sinong Geng, Houssam Nassif, Carlos A. Manzanares

We study dynamic discrete choice models, where a commonly studied problem involves estimating parameters of agent reward functions (also known as "structural" parameters), using agent behavioral data. Maximum likelihood estimation for such models requires dynamic programming, which is limited by the curse of dimensionality. In this work, we present a novel algorithm that provides a data-driven method for selecting and aggregating states, which lowers the computational and sample complexity of estimation. Our method works in two stages. In the first stage, we use a flexible inverse reinforcement learning approach to estimate agent Q-functions. We use these estimated Q-functions, along with a clustering algorithm, to select a subset of states that are the most pivotal for driving changes in Q-functions. In the second stage, with these selected "aggregated" states, we conduct maximum likelihood estimation using a commonly used nested fixed-point algorithm. The proposed two-stage approach mitigates the curse of dimensionality by reducing the problem dimension. Theoretically, we derive finite-sample bounds on the associated estimation error, which also characterize the trade-off of computational complexity, estimation error, and sample complexity. We demonstrate the empirical performance of the algorithm in two classic dynamic discrete choice estimation applications.
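The two-stage idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not name the inverse reinforcement learning estimator or the clustering algorithm, so the sketch assumes a Q-table `q_hat` is already available from stage one and swaps in plain k-means for the clustering step. It also assumes that member states of a cluster are weighted equally when transitions are aggregated, and it omits the Euler constant from the logit Bellman operator. All function names (`kmeans`, `aggregate_transitions`, `soft_value_iteration`) are hypothetical. Only the inner fixed point of a nested fixed-point routine is shown; the outer likelihood maximization over reward parameters is left out.

```python
import numpy as np


def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means (a stand-in clustering step); returns one label per row."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared distance from every point to every center.
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    # Relabel so that cluster ids are contiguous even if a cluster ended up empty.
    return np.unique(labels, return_inverse=True)[1]


def aggregate_transitions(P, labels):
    """Collapse P[a, s, s'] to cluster level, weighting member states equally (assumption)."""
    k = labels.max() + 1
    M = np.eye(k)[labels]      # (n_states, k) one-hot membership matrix
    W = M / M.sum(axis=0)      # column-normalized: averages over cluster members
    # P_agg[a, c, d] = mean over s in c of the probability of landing in cluster d.
    return np.einsum("sc,ast,td->acd", W, P, M)


def soft_value_iteration(r, P, beta=0.95, tol=1e-10, max_iter=10_000):
    """Fixed point of the logit Bellman operator, i.e. the inner loop of a nested fixed-point routine."""
    V = np.zeros(P.shape[1])
    for _ in range(max_iter):
        Q = r + beta * np.einsum("ast,t->sa", P, V)   # r has shape (n_states, n_actions)
        V_new = np.logaddexp.reduce(Q, axis=1)        # log-sum-exp over actions
        if np.max(np.abs(V_new - V)) < tol:
            break
        V = V_new
    return Q, V_new


# Synthetic demo: every number below is made up for illustration.
rng = np.random.default_rng(1)
n_states, n_actions, k = 30, 2, 5
q_hat = rng.normal(size=(n_states, n_actions))                    # stand-in for stage-one Q estimates
P = rng.dirichlet(np.ones(n_states), size=(n_actions, n_states))  # P[a, s, :] sums to 1
labels = kmeans(q_hat, k)                                         # stage one: group states by Q-values
P_agg = aggregate_transitions(P, labels)
r_agg = rng.normal(size=(labels.max() + 1, n_actions))            # hypothetical cluster-level reward
Q_agg, V_agg = soft_value_iteration(r_agg, P_agg)                 # stage two works on k states, not 30
```

In this sketch the dynamic program in stage two runs over at most `k` aggregated states instead of all `n_states`, which is the dimension reduction the abstract describes. How many clusters to use, and how that choice trades off against estimation error, is the subject of the paper's finite-sample bounds and is not captured here.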

