Sparse Communication via Mixed Distributions - Zhuanzhi (专知) Papers


Discretization · Continuity · Latent variables · MoDELS · Simplex

August 5, 2021

Sparse Communication via Mixed Distributions


António Farinhas, Wilker Aziz, Vlad Niculae, André F. T. Martins

Neural networks and other machine learning models compute continuous representations, while humans communicate mostly through discrete symbols. Reconciling these two forms of communication is desirable for generating human-readable interpretations or learning discrete latent variable models, while maintaining end-to-end differentiability. Some existing approaches (such as the Gumbel-Softmax transformation) build continuous relaxations that are discrete approximations in the zero-temperature limit, while others (such as sparsemax transformations and the Hard Concrete distribution) produce discrete/continuous hybrids. In this paper, we build rigorous theoretical foundations for these hybrids, which we call "mixed random variables." Our starting point is a new "direct sum" base measure defined on the face lattice of the probability simplex. From this measure, we introduce new entropy and Kullback-Leibler divergence functions that subsume the discrete and differential cases and have interpretations in terms of code optimality. Our framework suggests two strategies for representing and sampling mixed random variables, an extrinsic ("sample-and-project") and an intrinsic one (based on face stratification). We experiment with both approaches on an emergent communication benchmark and on modeling MNIST and Fashion-MNIST data with variational auto-encoders with mixed latent variables.
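To make the extrinsic ("sample-and-project") strategy mentioned in the abstract more concrete, here is a minimal sketch in Python with NumPy. It assumes, as one plausible reading of the abstract, that a sample is drawn from a Gaussian in the ambient space and then projected onto the probability simplex with sparsemax. The function names `sparsemax` and `sample_and_project`, the Gaussian noise choice, and the `mean` and `scale` parameters are illustrative assumptions, not the authors' code.

```python
import numpy as np


def sparsemax(z):
    """Euclidean projection of a score vector z onto the probability simplex."""
    z = np.asarray(z, dtype=float)
    z_sorted = np.sort(z)[::-1]              # scores in decreasing order
    cumsum = np.cumsum(z_sorted)
    k = np.arange(1, z.size + 1)
    support = 1 + k * z_sorted > cumsum      # true for the coordinates kept in the support
    k_max = k[support][-1]                   # size of the support
    tau = (cumsum[k_max - 1] - 1) / k_max    # threshold that makes the output sum to 1
    return np.maximum(z - tau, 0.0)


def sample_and_project(mean, scale, rng):
    """Hypothetical helper: perturb `mean` with Gaussian noise, then project."""
    noise = rng.normal(size=np.shape(mean))
    return sparsemax(np.asarray(mean, dtype=float) + scale * noise)


rng = np.random.default_rng(0)
samples = np.stack(
    [sample_and_project([1.0, 0.5, -1.0], 1.0, rng) for _ in range(1000)]
)

# Fraction of samples that landed on a proper face of the simplex,
# i.e. have at least one coordinate that is exactly zero.
frac_sparse = np.mean((samples == 0).any(axis=1))
print(f"fraction of samples with an exact zero: {frac_sparse:.2f}")
```

Because the projection clips coordinates to exactly zero, a sample can land on a vertex, an edge, or the interior of the simplex with positive probability. That is the sense in which such a variable is a discrete/continuous hybrid rather than a purely continuous relaxation.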


