减少监督代表学习的深层内容 (Deep Dimension Reduction for Supervised Representation Learning)

The goal of supervised representation learning is to construct effective data representations for prediction. Among all the characteristics of an ideal nonparametric representation of high-dimensional complex data, sufficiency, low dimensionality and disentanglement are some of the most essential ones. We propose a deep dimension reduction approach to learning representations with these characteristics. The proposed approach is a nonparametric generalization of the sufficient dimension reduction method. We formulate the ideal representation learning task as that of finding a nonparametric representation that minimizes an objective function characterizing conditional independence and promoting disentanglement at the population level. We then estimate the target representation at the sample level nonparametrically using deep neural networks. We show that the estimated deep nonparametric representation is consistent in the sense that its excess risk converges to zero. Our extensive numerical experiments using simulated and real benchmark data demonstrate that the proposed methods have better performance than several existing dimension reduction methods and the standard deep learning models in the context of classification and regression.

翻译：监督代表制学习的目的是为预测建立有效的数据表示方式; 高维复杂数据、充足性、低维度和分解等理想的非参数表示方式的所有特点都是一些最基本的特征; 我们提议了一种深度减少维度的方法,以了解具有这些特征的表述方式; 提议的方法是对充分的减少维度方法进行非对称的概括化; 我们制定了理想的代表性学习任务,以找到一种非对称的表示方式,以最大限度地减少有条件独立这一客观功能的特征,并促进人口层面的分解; 然后利用深神经网络对抽样层的目标表示方式进行非对称性估计; 我们表明,估计的深度非参数表示方式是一致的,因为其超重风险将集中到零。我们利用模拟和实际基准数据进行的大量数字实验表明,在分类和回归方面,拟议方法的性能优于现有的若干减少维度方法和标准深层学习模型。

相关内容

表示学习

关注 186

表示学习是通过利用训练数据来学习得到向量表示，这可以克服人工方法的局限性。表示学习通常可分为两大类，无监督和有监督表示学习。大多数无监督表示学习方法利用自动编码器（如去噪自动编码器和稀疏自动编码器等）中的隐变量作为表示。目前出现的变分自动编码器能够更好的容忍噪声和异常值。然而，推断给定数据的潜在结构几乎是不可能的。目前有一些近似推断的策略。此外，一些无监督表示学习方法旨在近似某种特定的相似性度量。提出了一种无监督的相似性保持表示学习框架，该框架使用矩阵分解来保持成对的DTW相似性。通过学习保持DTW的shaplets，即在转换后的空间中的欧式距离近似原始数据的真实DTW距离。有监督表示学习方法可以利用数据的标签信息，更好地捕获数据的语义结构。孪生网络和三元组网络是目前两种比较流行的模型，它们的目标是最大化类别之间的距离并最小化了类别内部的距离。

【ICML2021】核持续学习，Kernel Continual Learning

专知会员服务

32+阅读 · 2021年7月15日

【干货书】机器学习速查手册，135页pdf

专知会员服务

127+阅读 · 2020年11月20日

【ICML2020】深度神经网络置信感知学习，Conﬁdence-Aware Learning for Deep Neural Networks

专知会员服务

74+阅读 · 2020年7月6日

【DeepMind深度学习课程】无监督表示学习前沿进展，129页ppt，Unsupervised Representation Learning

专知会员服务

79+阅读 · 2020年6月29日