The data distribution commonly evolves over time, leading to problems such as concept drift that often degrade classifier performance. We seek to predict unseen data (and their labels), which allows us to tackle the challenges of a non-constant data distribution in a \emph{proactive} manner, rather than detecting and reacting to changes that might already have led to errors. To this end, we learn a domain transformer in an unsupervised manner that allows generating data of unseen domains. Our approach first matches the independently learned latent representations of two given domains, each obtained from an auto-encoder, using a Cycle-GAN. In turn, a transformation of the original samples can be learned and applied iteratively to extrapolate to unseen domains. Our evaluation with CNNs on image data confirms the usefulness of the approach. It also achieves very good results on the well-known problem of unsupervised domain adaptation, where labels but not samples have to be predicted.
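The pipeline sketched above (per-domain auto-encoders, a latent-space mapping between consecutive domains, and iterated application of that mapping to reach unseen domains) can be illustrated schematically. This is a minimal sketch, not the paper's implementation: the learned networks (auto-encoder and Cycle-GAN generator) are replaced by hypothetical fixed linear maps purely to show the data flow.

```python
import numpy as np

rng = np.random.default_rng(0)

d_in, d_lat = 8, 4
# Hypothetical stand-ins for learned components:
W_enc = rng.normal(size=(d_in, d_lat))   # encoder of the auto-encoder
W_dec = np.linalg.pinv(W_enc)            # decoder (crude pseudo-inverse)
W_gan = np.eye(d_lat) * 1.05             # toy drift: slight scaling in latent space
b_gan = np.full(d_lat, 0.1)              # toy drift: latent shift

def encode(x):
    """Auto-encoder encoder: sample -> latent code."""
    return x @ W_enc

def decode(z):
    """Auto-encoder decoder: latent code -> sample."""
    return z @ W_dec

def latent_transform(z):
    """Stand-in for the Cycle-GAN generator mapping the latent
    representation of domain t to that of domain t+1."""
    return z @ W_gan + b_gan

def extrapolate(x, steps):
    """Apply the latent-space transformation `steps` times to
    generate samples of unseen (future) domains."""
    z = encode(x)
    for _ in range(steps):
        z = latent_transform(z)
    return decode(z)

x_domain0 = rng.normal(size=(5, d_in))          # samples of the last observed domain
x_domain3 = extrapolate(x_domain0, steps=3)     # predicted samples three domains ahead
```

Iterating the same learned transformation is what distinguishes extrapolation to unseen domains from ordinary one-step domain translation.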