Neural networks and other machine learning models compute continuous representations, while humans communicate with discrete symbols. Reconciling these two forms of communication is desirable to generate human-readable interpretations or to learn discrete latent variable models, while maintaining end-to-end differentiability. Some existing approaches (such as the Gumbel-softmax transformation) build continuous relaxations that are discrete approximations in the zero-temperature limit, while others (such as sparsemax transformations and the hard concrete distribution) produce discrete/continuous hybrids. In this paper, we build rigorous theoretical foundations for these hybrids. Our starting point is a new "direct sum" base measure defined on the face lattice of the probability simplex. From this measure, we introduce a new entropy function that includes the discrete and differential entropies as particular cases, and has an interpretation in terms of code optimality, as well as two other information-theoretic counterparts that generalize the mutual information and Kullback-Leibler divergences. Finally, we introduce "mixed languages" as strings of hybrid symbols and a new mixed weighted finite state automaton that recognizes a class of regular mixed languages, generalizing closure properties of regular languages.
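To make the contrast between the two families of approaches concrete, here is a minimal sketch (not from the paper, NumPy only; the logits, temperature, and function names are illustrative assumptions). A Gumbel-softmax sample is dense, with every coordinate strictly positive, and only approaches a one-hot vector in the zero-temperature limit; sparsemax, the Euclidean projection onto the probability simplex, can assign exact zeros, placing its output on a proper face of the simplex and yielding the kind of discrete/continuous hybrid discussed above.

```python
# Minimal illustrative sketch (assumptions noted above), not the paper's implementation.
import numpy as np

rng = np.random.default_rng(0)

def gumbel_softmax(logits, temperature=0.5):
    """Continuous relaxation: dense probabilities that become one-hot
    only in the zero-temperature limit."""
    gumbel_noise = -np.log(-np.log(rng.uniform(size=logits.shape)))
    scores = (logits + gumbel_noise) / temperature
    scores -= scores.max()                    # numerical stability
    exp_scores = np.exp(scores)
    return exp_scores / exp_scores.sum()

def sparsemax(logits):
    """Discrete/continuous hybrid: Euclidean projection onto the simplex,
    which can produce exact zeros, i.e. land on a face of the simplex."""
    z = np.sort(logits)[::-1]                 # sort in decreasing order
    cumsum = np.cumsum(z)
    k = np.arange(1, len(z) + 1)
    support = 1 + k * z > cumsum              # coordinates kept in the support
    k_max = k[support][-1]
    tau = (cumsum[k_max - 1] - 1) / k_max     # threshold for the projection
    return np.maximum(logits - tau, 0.0)

logits = np.array([1.2, 0.8, -1.0])
print(gumbel_softmax(logits))  # all entries strictly positive (interior of the simplex)
print(sparsemax(logits))       # e.g. [0.7, 0.3, 0.0]: a point on a face of the simplex
```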