For a number of machine learning problems, an exogenous variable can be identified that heavily influences the appearance of the different classes, and an ideal classifier should be invariant to this variable. An example of such an exogenous variable is identity when facial expression recognition (FER) is considered. In this paper, we propose a dual exogenous/endogenous representation. The former captures the exogenous variable, whereas the latter models the task at hand (e.g. facial expression). We design a prediction layer that uses a tree-gated deep ensemble conditioned on the exogenous representation. We also propose an exogenous dispelling loss to remove the exogenous information from the endogenous representation. Thus, the exogenous information is used twice and then thrown away: first as a conditioning variable for the target task, and second to enforce invariance within the endogenous representation. We call this method THIN, standing for THrowable Information Networks. We experimentally validate THIN in several contexts where an exogenous variable can be identified, such as digit recognition under large rotations and shape recognition at multiple scales. We also apply it to FER with identity as the exogenous variable. We demonstrate that THIN significantly outperforms state-of-the-art approaches on several challenging datasets.
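To make the dual exogenous/endogenous idea concrete, the following is a minimal, illustrative PyTorch sketch of such a setup, not the paper's actual architecture: it assumes a flat softmax gate over a small ensemble (the paper uses a tree-structured gate) and implements the dispelling objective with a gradient-reversal trick, which is one common way to remove information from a representation. All class names, dimensions, and loss weightings here are hypothetical.

```python
# Illustrative sketch only. ExoEncoder/EndoEncoder structure, the flat gate,
# the gradient-reversal dispelling loss, and all dimensions are assumptions;
# THIN's tree-gated ensemble and loss definitions are specified in the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GradReverse(torch.autograd.Function):
    """Identity in the forward pass, reversed gradient in the backward pass."""
    @staticmethod
    def forward(ctx, x):
        return x

    @staticmethod
    def backward(ctx, grad_output):
        return -grad_output


class DualRepresentationModel(nn.Module):
    def __init__(self, in_dim, feat_dim, n_classes, n_exo_classes, n_experts=4):
        super().__init__()
        # Exogenous branch: models the nuisance variable (e.g. identity).
        self.exo_encoder = nn.Sequential(nn.Linear(in_dim, feat_dim), nn.ReLU())
        self.exo_head = nn.Linear(feat_dim, n_exo_classes)
        # Endogenous branch: models the target task (e.g. facial expression).
        self.endo_encoder = nn.Sequential(nn.Linear(in_dim, feat_dim), nn.ReLU())
        # Ensemble of task predictors, gated by the exogenous representation.
        self.experts = nn.ModuleList(
            [nn.Linear(feat_dim, n_classes) for _ in range(n_experts)]
        )
        self.gate = nn.Linear(feat_dim, n_experts)
        # Auxiliary head used only to dispel exogenous information
        # from the endogenous representation.
        self.dispel_head = nn.Linear(feat_dim, n_exo_classes)

    def forward(self, x):
        z_exo = self.exo_encoder(x)
        z_endo = self.endo_encoder(x)
        # Condition the ensemble on the exogenous representation.
        weights = F.softmax(self.gate(z_exo), dim=-1)                   # (B, E)
        expert_logits = torch.stack([e(z_endo) for e in self.experts], dim=1)
        task_logits = (weights.unsqueeze(-1) * expert_logits).sum(dim=1)
        # Gradient reversal pushes z_endo to be uninformative about
        # the exogenous variable.
        dispel_logits = self.dispel_head(GradReverse.apply(z_endo))
        exo_logits = self.exo_head(z_exo)
        return task_logits, exo_logits, dispel_logits


def training_step(model, x, y_task, y_exo, optimizer):
    """Hypothetical training step: the exogenous label supervises both the
    exogenous branch and the dispelling objective; at test time only
    task_logits are used, so the exogenous information is 'thrown away'."""
    task_logits, exo_logits, dispel_logits = model(x)
    loss = (F.cross_entropy(task_logits, y_task)
            + F.cross_entropy(exo_logits, y_exo)
            + F.cross_entropy(dispel_logits, y_exo))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```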