In this paper, we study the problem of regret minimization in reinforcement learning (RL) under differential privacy constraints. This work is motivated by the wide range of RL applications for providing personalized services, where privacy concerns are becoming paramount. In contrast to previous works, we take the first step towards non-tabular RL settings while providing a rigorous privacy guarantee. In particular, we consider the adaptive control of differentially private linear quadratic (LQ) systems. We develop the first private RL algorithm, PRL, which attains sub-linear regret while guaranteeing privacy protection. More importantly, the additional cost due to privacy is only on the order of $\frac{\ln(1/\delta)^{1/4}}{\epsilon^{1/2}}$ given privacy parameters $\epsilon, \delta > 0$. Along the way, we also provide a general procedure for the adaptive control of LQ systems under changing regularizers, which not only generalizes previous non-private controls but also serves as the basis for general private controls.