Learning disentangled representations of textual data is essential for many natural language tasks such as fair classification, style transfer and sentence generation, among others. The dominant existing approaches for text either rely on training an adversary (discriminator) that aims to make attribute values difficult to infer from the latent code, or on minimising variational bounds of the mutual information between the latent code and the attribute value. However, the available methods cannot provide fine-grained control of the degree (or force) of disentanglement. Adversarial methods, while remarkably simple, have a known weakness: even though the adversary appears to perform perfectly well during training, a fair amount of information about the undesired attribute still remains in the latent code once training is completed. This paper introduces a novel variational upper bound on the mutual information between an attribute and the latent code of an encoder. Our bound controls the approximation error via the Rényi divergence, leading both to better disentangled representations and, in particular, to more precise control of the desired degree of disentanglement than state-of-the-art methods proposed for textual data. Furthermore, it does not suffer from the degeneracy of other losses in multi-class scenarios. We show the superiority of this method on fair classification and on textual style transfer tasks. Additionally, we provide new insights illustrating the trade-off in style transfer between learning disentangled representations and the quality of the generated sentences.
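To make the starting point concrete, here is a minimal sketch of the classical variational upper bound on mutual information that such bounds refine; the variational marginal $r(y)$ is our notation, and the Rényi-controlled variant is only indicated schematically rather than reproducing the paper's exact bound. For any distribution $r(y)$ over attribute values,
\[
  I(Z;Y) \;=\; \mathbb{E}_{p(z)}\big[ D_{\mathrm{KL}}\big(p(y\mid z)\,\|\, p(y)\big) \big]
  \;\le\; \mathbb{E}_{p(z)}\big[ D_{\mathrm{KL}}\big(p(y\mid z)\,\|\, r(y)\big) \big],
\]
with equality iff $r(y) = p(y)$; the gap is exactly $D_{\mathrm{KL}}(p(y)\,\|\,r(y))$, i.e., the approximation error of the variational marginal. Measuring this error with the Rényi divergence
\[
  D_{\alpha}(p \,\|\, q) \;=\; \tfrac{1}{\alpha-1}\,\log \mathbb{E}_{q}\!\big[(p/q)^{\alpha}\big],
  \qquad \lim_{\alpha \to 1} D_{\alpha} = D_{\mathrm{KL}},
\]
makes the order $\alpha$ available as a tuning knob, which is one way the fine-grained control of the disentanglement strength described above can be obtained.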