Contrastive learning is a family of self-supervised methods in which a model is trained to solve a classification task constructed from unlabeled data. It has recently emerged as one of the leading learning paradigms in the absence of labels across many different domains (e.g., brain imaging, text, images). However, theoretical understanding of many aspects of training, both statistical and algorithmic, remains fairly elusive. In this work, we study the time-series setting -- more precisely, the setting in which data come from a strong-mixing, continuous-time stochastic process. We show that a properly constructed contrastive learning task can be used to estimate the transition kernel over small-to-mid-range intervals in the diffusion case. Moreover, we give sample complexity bounds for solving this task and quantitatively characterize what the value of the contrastive loss implies for distributional closeness of the learned kernel. As a byproduct, we illuminate appropriate settings for the contrastive distribution, as well as for the other hyperparameters of this setup.
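The kind of contrastive task described above can be illustrated with a minimal sketch; this is an assumed toy instance, not the paper's actual construction. A 1-D Ornstein-Uhlenbeck diffusion stands in for the strong-mixing process, positive pairs are observations a lag Δ apart, negatives pair an observation with an independent draw from the (stationary) marginal, and a logistic classifier on hand-chosen quadratic features plays the role of the learned model; the lag, the negative (contrastive) distribution, and the feature map are the hyperparameter choices the abstract refers to.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a 1-D Ornstein-Uhlenbeck diffusion dX_t = -X_t dt + dW_t
# via Euler-Maruyama; this stands in for the strong-mixing process.
dt, n = 0.01, 20000
x = np.zeros(n)
for i in range(1, n):
    x[i] = x[i - 1] - x[i - 1] * dt + np.sqrt(dt) * rng.standard_normal()

lag = 10  # small-to-mid-range interval: Delta = lag * dt
a, b = x[:-lag], x[lag:]       # positive pairs (X_t, X_{t+Delta}) ~ joint law
b_neg = rng.permutation(b)     # negatives: second coordinate ~ marginal

# Quadratic features: for a Gaussian (OU) transition kernel, the
# Bayes-optimal logit log p(b | a) - log p(b) is a quadratic in (a, b),
# so this hypothesis class is well-specified for the toy example.
def feats(u, v):
    return np.stack([u * v, u**2, v**2, np.ones_like(u)], axis=1)

X = np.vstack([feats(a, b), feats(a, b_neg)])
y = np.concatenate([np.ones(len(a)), np.zeros(len(a))])

# The contrastive task: logistic regression (plain gradient ascent)
# telling joint pairs from product-of-marginals pairs.
w = np.zeros(X.shape[1])
for _ in range(1000):
    p = 1.0 / (1.0 + np.exp(-X @ w))
    w += 0.5 * X.T @ (y - p) / len(y)

acc = ((1.0 / (1.0 + np.exp(-X @ w)) > 0.5) == y).mean()
print(f"contrastive accuracy: {acc:.3f}")
```

The point of the sketch is that the learned logit approximates the density ratio log p_Δ(b | a) − log p(b), so solving the classification task recovers the transition kernel up to the (known) negative distribution; how the achieved contrastive loss translates into distributional closeness of that estimate is exactly the kind of question the abstract says the paper quantifies.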