Extracting fine-grained features such as styles from unlabeled data is crucial for data analysis. Unsupervised methods such as variational autoencoders (VAEs) can extract styles, but the extracted styles are usually entangled with other features. Conditional VAEs (CVAEs) can isolate styles by conditioning on class labels; however, no established method extracts only styles from unlabeled data. In this paper, we propose a CVAE-based method that extracts style features using only unlabeled data. The proposed model consists of a contrastive learning (CL) part that extracts style-independent features and a CVAE part that extracts style features. The CL model learns, in a self-supervised manner, representations that are invariant to data augmentation, which can be viewed as a perturbation of style. Using the style-independent features from the pretrained CL model as the condition, the CVAE learns to extract only styles. Additionally, we introduce a constraint based on the mutual information between the CL and VAE features to prevent the CVAE from ignoring the condition. Experiments on two simple datasets, MNIST and an original dataset based on Google Fonts, demonstrate that the proposed method can efficiently extract style features. Further experiments on real-world natural image datasets illustrate the method's extensibility.
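The sketch below is a minimal illustration, in assumed PyTorch code, of the pipeline the abstract describes: a pretrained, frozen contrastive encoder supplies a style-independent condition, a CVAE encodes and decodes images given that condition, and an auxiliary penalty discourages the CVAE latent from duplicating the conditioning information. All module and variable names (CLEncoder, ConditionalVAE, cvae_loss) are hypothetical, and the cross-covariance term stands in for the paper's mutual-information constraint, whose exact estimator is not specified here.

```python
# Minimal sketch (assumption: PyTorch). Names are illustrative, not from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F


class CLEncoder(nn.Module):
    """Stand-in for a pretrained contrastive encoder producing style-independent features."""
    def __init__(self, cond_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(), nn.Linear(28 * 28, 256), nn.ReLU(), nn.Linear(256, cond_dim)
        )

    def forward(self, x):
        return self.net(x)


class ConditionalVAE(nn.Module):
    """CVAE whose encoder and decoder both receive the CL condition c."""
    def __init__(self, latent_dim=16, cond_dim=64):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Linear(28 * 28 + cond_dim, 256), nn.ReLU(), nn.Linear(256, 2 * latent_dim)
        )
        self.dec = nn.Sequential(
            nn.Linear(latent_dim + cond_dim, 256), nn.ReLU(), nn.Linear(256, 28 * 28)
        )

    def forward(self, x, c):
        h = self.enc(torch.cat([x.flatten(1), c], dim=1))
        mu, logvar = h.chunk(2, dim=1)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        x_rec = self.dec(torch.cat([z, c], dim=1))
        return x_rec, mu, logvar, z


def cvae_loss(x, x_rec, mu, logvar, z, c, beta=1.0, gamma=1.0):
    """ELBO terms plus a simple dependence penalty between z and c.

    The cross-covariance penalty is only a placeholder for the paper's
    mutual-information constraint between the CL and VAE features.
    """
    rec = F.binary_cross_entropy_with_logits(x_rec, x.flatten(1), reduction="sum") / x.size(0)
    kld = -0.5 * torch.mean(torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=1))
    zc = (z - z.mean(0)).T @ (c - c.mean(0)) / x.size(0)  # cross-covariance proxy
    dep = zc.pow(2).mean()
    return rec + beta * kld + gamma * dep


# Usage: the CL encoder is pretrained and frozen; only the CVAE is trained.
cl, cvae = CLEncoder(), ConditionalVAE()
cl.requires_grad_(False)
x = torch.rand(8, 1, 28, 28)            # dummy batch of MNIST-sized images
with torch.no_grad():
    c = cl(x)                           # style-independent condition
x_rec, mu, logvar, z = cvae(x, c)
loss = cvae_loss(x, x_rec, mu, logvar, z, c)
loss.backward()
```

In this reading, because class-like content is already available to the decoder through the condition c, the latent z is pushed toward capturing the remaining variation, i.e. style; the dependence penalty is one way to keep the CVAE from ignoring c, as the abstract motivates.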