通过ADMMM(ADMM)的可被证实的一致信息瓶颈解决方案 (A Provably Convergent Information Bottleneck Solution via ADMM) - 专知论文

会员服务 ·

0

INFORMS · 增广拉格朗日法 · 预测准确率 · 学成 · 模型评估 ·

2021 年 2 月 9 日

A Provably Convergent Information Bottleneck Solution via ADMM

翻译：通过ADMMM(ADMM)的可被证实的一致信息瓶颈解决方案

Teng-Hui Huang,Aly El Gamal

from arxiv, 8 pages, 2 figures

The Information bottleneck (IB) method enables optimizing over the trade-off between compression of data and prediction accuracy of learned representations, and has successfully and robustly been applied to both supervised and unsupervised representation learning problems. However, IB has several limitations. First, the IB problem is hard to optimize. The IB Lagrangian $\mathcal{L}_{IB}:=I(X;Z)-\beta I(Y;Z)$ is non-convex and existing solutions guarantee only local convergence. As a result, the obtained solutions depend on initialization. Second, the evaluation of a solution is also a challenging task. Conventionally, it resorts to characterizing the information plane, that is, plotting $I(Y;Z)$ versus $I(X;Z)$ for all solutions obtained from different initial points. Furthermore, the IB Lagrangian has phase transitions while varying the multiplier $\beta$. At phase transitions, both $I(X;Z)$ and $I(Y;Z)$ increase abruptly and the rate of convergence becomes significantly slow for existing solutions. Recent works with IB adopt variational surrogate bounds to the IB Lagrangian. Although allowing efficient optimization, how close are these surrogates to the IB Lagrangian is not clear. In this work, we solve the IB Lagrangian using augmented Lagrangian methods. With augmented variables, we show that the IB objective can be solved with the alternating direction method of multipliers (ADMM). Different from prior works, we prove that the proposed algorithm is consistently convergent, regardless of the value of $\beta$. Empirically, our gradient-descent-based method results in information plane points that are denser and comparable to those obtained through the conventional Blahut-Arimoto-based solvers.

翻译：信息瓶颈( IB) 方法能够优化数据压缩和所学表现的预测准确性之间的权衡, 并且成功和有力地应用到受监管和不受监管的演示学习问题。但是, IB 有几个限制。首先, IB 问题很难优化。 IB Lagrangian $\ mathcal{L ⁇ IB} : = I( X; Z)\beta I( Y; Z) 美元是非默认的, 现有解决方案只能保证本地的趋同。因此, 获得的解决方案取决于初始化。其次, 对解决方案的评估也是一项具有挑战性的任务。公约性, IB 使用信息平面图, 即绘制美元( Y; Z) 相对于美元( 美元) 美元( 美元), I( ) 平面值( ) 和美元( 美元) 。在阶段过渡中, 变异端法( ) 和变法( Y; 递增 ) 和美元( Z) 趋近点, 我们的趋近点( B) 渐渐变法。虽然最近的工作进展法法能让I 渐渐渐渐渐渐渐渐渐渐变。

0

相关内容

INFORMS

《计算机信息》杂志发表高质量的论文，扩大了运筹学和计算的范围，寻求有关理论、方法、实验、系统和应用方面的原创研究论文、新颖的调查和教程论文，以及描述新的和有用的软件工具的论文。官网链接：https://pubsonline.informs.org/journal/ijoc

ICLR 2021杰出论文奖出炉，8篇论文上榜！

专知会员服务

26+阅读 · 2021年4月2日

2020数据工程师成长路线图

专知会员服务

41+阅读 · 2020年9月6日

(普林斯顿讲义)：高维概率论，326页pdf《Probability in High Dimension》

(普林斯顿讲义)：高维概率论，326页pdf《Probability in High Dimension》

专知会员服务

122+阅读 · 2020年5月30日

【ICLR2020】深度神经网络优化轨迹的平衡点，The Break-Even Point on Optimization Trajectories of Deep Neural Networks

【ICLR2020】深度神经网络优化轨迹的平衡点，The Break-Even Point on Optimization Trajectories of Deep Neural Networks

专知会员服务

34+阅读 · 2020年2月27日

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

专知会员服务

15+阅读 · 2019年10月23日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

181+阅读 · 2019年10月11日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

TCN v2 + 3Dconv 运动信息

TCN v2 + 3Dconv 运动信息

CreateAMind

4+阅读 · 2019年1月8日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

【泡泡一分钟】基于视觉传感器的三维空间几何重建（3dv-16）

【泡泡一分钟】基于视觉传感器的三维空间几何重建（3dv-16）

泡泡机器人SLAM

4+阅读 · 2017年12月18日

NIPS 2017：贝叶斯深度学习与深度贝叶斯学习（讲义+视频）

NIPS 2017：贝叶斯深度学习与深度贝叶斯学习（讲义+视频）

机器学习研究会

36+阅读 · 2017年12月10日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

Improved Max-value Entropy Search for Multi-objective Bayesian Optimization with Constraints

Arxiv

0+阅读 · 2021年4月2日

Delay-Tolerant Consensus-based Distributed Estimation: Full-Rank Systems with Potentially Unstable Dynamics

Arxiv

0+阅读 · 2021年4月1日

Decentralized and Model-Free Federated Learning: Consensus-Based Distillation in Function Space

Arxiv

0+阅读 · 2021年4月1日

On the Convergence Time of Federated Learning Over Wireless Networks Under Imperfect CSI

Arxiv

0+阅读 · 2021年4月1日

Robust Experimentation in the Continuous Time Bandit Problem

Arxiv

0+阅读 · 2021年3月31日

Generalized Conflict-directed Search for Optimal Ordering Problems

Arxiv

0+阅读 · 2021年3月31日

Contractivity of Runge-Kutta methods for convex gradient systems

Arxiv

0+阅读 · 2021年3月31日

Revisiting Bayesian Optimization in the light of the COCO benchmark

Arxiv

0+阅读 · 2021年3月30日

Disentangled Information Bottleneck

Disentangled Information Bottleneck

Arxiv

12+阅读 · 2020年12月22日

Learning Optimal Representations with the Decodable Information Bottleneck

Arxiv

6+阅读 · 2020年9月27日

VIP会员

文章信息

相关主题

增广拉格朗日法

预测准确率

相关VIP内容

ICLR 2021杰出论文奖出炉，8篇论文上榜！

专知会员服务

26+阅读 · 2021年4月2日

2020数据工程师成长路线图

专知会员服务

41+阅读 · 2020年9月6日

(普林斯顿讲义)：高维概率论，326页pdf《Probability in High Dimension》

(普林斯顿讲义)：高维概率论，326页pdf《Probability in High Dimension》

专知会员服务

122+阅读 · 2020年5月30日

【ICLR2020】深度神经网络优化轨迹的平衡点，The Break-Even Point on Optimization Trajectories of Deep Neural Networks

【ICLR2020】深度神经网络优化轨迹的平衡点，The Break-Even Point on Optimization Trajectories of Deep Neural Networks

专知会员服务

34+阅读 · 2020年2月27日

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

专知会员服务

15+阅读 · 2019年10月23日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

181+阅读 · 2019年10月11日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《基于大型语言模型的软件工程自动化研究》最新264页

《基于大型语言模型的信号处理管线研究：推进军事电子情报工作流程》最新76页

中文版 | 战争算法：生成式人工智能在战场的崛起

中文版《美国陆军：战术行为性远程医疗实施观察与建议》

相关资讯

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

TCN v2 + 3Dconv 运动信息

TCN v2 + 3Dconv 运动信息

CreateAMind

4+阅读 · 2019年1月8日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

【泡泡一分钟】基于视觉传感器的三维空间几何重建（3dv-16）

【泡泡一分钟】基于视觉传感器的三维空间几何重建（3dv-16）

泡泡机器人SLAM

4+阅读 · 2017年12月18日

NIPS 2017：贝叶斯深度学习与深度贝叶斯学习（讲义+视频）

NIPS 2017：贝叶斯深度学习与深度贝叶斯学习（讲义+视频）

机器学习研究会

36+阅读 · 2017年12月10日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

相关论文

Improved Max-value Entropy Search for Multi-objective Bayesian Optimization with Constraints

Arxiv

0+阅读 · 2021年4月2日

Delay-Tolerant Consensus-based Distributed Estimation: Full-Rank Systems with Potentially Unstable Dynamics

Arxiv

0+阅读 · 2021年4月1日

Decentralized and Model-Free Federated Learning: Consensus-Based Distillation in Function Space

Arxiv

0+阅读 · 2021年4月1日

On the Convergence Time of Federated Learning Over Wireless Networks Under Imperfect CSI

Arxiv

0+阅读 · 2021年4月1日

Robust Experimentation in the Continuous Time Bandit Problem

Arxiv

0+阅读 · 2021年3月31日

Generalized Conflict-directed Search for Optimal Ordering Problems

Arxiv

0+阅读 · 2021年3月31日

Contractivity of Runge-Kutta methods for convex gradient systems

Arxiv

0+阅读 · 2021年3月31日

Revisiting Bayesian Optimization in the light of the COCO benchmark

Arxiv

0+阅读 · 2021年3月30日

Disentangled Information Bottleneck

Disentangled Information Bottleneck

Arxiv

12+阅读 · 2020年12月22日

Learning Optimal Representations with the Decodable Information Bottleneck

Arxiv

6+阅读 · 2020年9月27日

微信扫码咨询专知VIP会员