In this work, we propose DeepMI, an information-theoretic framework for training deep neural networks (DNNs) using Mutual Information (MI). The DeepMI framework is especially targeted at, but not limited to, learning real-world tasks in an unsupervised manner. The primary motivation behind this work is the insufficiency of traditional loss functions for unsupervised task learning. Moreover, using MI directly as a training objective is quite challenging because MI is unbounded above. Hence, we develop an alternative linearized representation of MI as part of the framework. The contributions of this paper are threefold: i) an investigation of MI for training deep neural networks, ii) a novel loss function L_LMI, and iii) a fuzzy-logic-based, end-to-end differentiable pipeline that integrates DeepMI into the deep learning framework. We choose a few unsupervised learning tasks for our experimental study. We demonstrate that L_LMI alone provides better gradients, achieving better neural network performance than cases where multiple loss functions are used for a given task.
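The abstract motivates L_LMI by the fact that MI is unbounded above. The exact form of L_LMI is not specified here; as a minimal sketch of the underlying issue, assuming only the standard definition of discrete MI, the snippet below shows that I(X;Y) grows without bound (equal to log n for a deterministic coupling of n symbols), whereas a normalized variant such as I(X;Y)/max(H(X), H(Y)) stays in [0, 1]. The normalization used here is a common bounding device for illustration only, not the paper's L_LMI.

```python
import numpy as np

def entropy(p):
    """Shannon entropy (in nats) of a probability vector, ignoring zero entries."""
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def mutual_information(joint):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for a discrete joint distribution."""
    hx = entropy(joint.sum(axis=1))
    hy = entropy(joint.sum(axis=0))
    return hx + hy - entropy(joint.ravel())

def normalized_mi(joint):
    """A bounded surrogate in [0, 1]; illustrative only, not the paper's L_LMI."""
    hx = entropy(joint.sum(axis=1))
    hy = entropy(joint.sum(axis=0))
    mi = hx + hy - entropy(joint.ravel())
    return mi / max(hx, hy)

# For a deterministic coupling of n symbols (joint = I/n), I(X;Y) = log n,
# so raw MI grows without bound as n increases, while the normalized form stays at 1.
for n in (2, 16, 1024):
    joint = np.eye(n) / n
    print(n, mutual_information(joint), normalized_mi(joint))
```

A bounded objective of this kind keeps the loss scale stable across problems of different alphabet (or representation) size, which is one reason an unbounded raw MI is awkward to optimize directly with gradient descent.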