Many machine learning techniques incorporate identity-preserving transformations into their models to improve generalization to previously unseen data. These transformations are typically selected from a set of functions known to maintain the identity of an input when applied (e.g., rotation, translation, flipping, and scaling). However, many natural variations can be neither labeled for supervision nor defined through examination of the data. As suggested by the manifold hypothesis, many of these natural variations live on or near a low-dimensional, nonlinear manifold. Several techniques represent manifold variations through a set of learned Lie group operators that define directions of motion on the manifold. However, these approaches are limited because they require transformation labels during training and lack a method for determining which regions of the manifold are appropriate for applying each specific operator. We address these limitations by introducing a learning strategy that does not require transformation labels and by developing a method that learns the local regions where each operator is likely to be used while preserving the identity of inputs. Experiments on MNIST and Fashion MNIST highlight our model's ability to learn identity-preserving transformations on multi-class datasets. Additionally, we train on CelebA to showcase our model's ability to learn semantically meaningful transformations on complex datasets in an unsupervised manner.
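To make the Lie group operator idea concrete, the following is a minimal sketch (not the authors' implementation) of how a learned operator matrix A can define a direction of motion: a point z is transported along the manifold by applying the matrix exponential expm(c·A), here approximated with a truncated Taylor series. The function name, the 2×2 rotation generator, and the series length are all illustrative assumptions.

```python
import numpy as np

def apply_lie_operator(z, A, c, terms=20):
    """Transport point z along the one-parameter group generated by A,
    i.e. compute expm(c * A) @ z via the truncated Taylor series
    sum_k (c * A)^k z / k!. (Illustrative sketch, not the paper's code.)"""
    result = z.astype(float).copy()
    term = z.astype(float).copy()
    for k in range(1, terms):
        term = (c / k) * (A @ term)  # next Taylor term applied to z
        result = result + term
    return result

# Toy example: the 2x2 generator of planar rotations.
A = np.array([[0.0, -1.0],
              [1.0,  0.0]])
z = np.array([1.0, 0.0])
z_rot = apply_lie_operator(z, A, np.pi / 2)  # rotates z by 90 degrees
```

Varying the scalar coefficient c traces out a smooth path on the manifold, which is what allows such operators to model continuous natural variations rather than a fixed set of hand-picked augmentations.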