We introduce semi-parametric inducing point networks (SPIN), a general-purpose architecture that can query the training set at inference time in a compute-efficient manner. Semi-parametric architectures are typically more compact than parametric models, but their computational complexity is often quadratic. In contrast, SPIN attains linear complexity via a cross-attention mechanism between datapoints inspired by inducing point methods. Querying large training sets can be particularly useful in meta-learning, as it unlocks additional training signal, but often exceeds the scaling limits of existing models. We use SPIN as the basis of the Inducing Point Neural Process, a probabilistic model which supports large contexts in meta-learning and achieves high accuracy where existing models fail. In our experiments, SPIN reduces memory requirements, improves accuracy across a range of meta-learning tasks, and improves state-of-the-art performance on an important practical problem, genotype imputation.
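To make the complexity claim concrete, below is a minimal sketch (not the authors' implementation) of the core idea: a small, learned set of m inducing points attends over n context datapoints via cross-attention, costing O(m·n) score computations, linear in n, rather than the O(n²) of full self-attention among datapoints. The class and parameter names here are hypothetical illustrations.

```python
# Minimal sketch, assuming an inducing-point cross-attention layer in the
# spirit of SPIN; not the authors' code. Names are hypothetical.
import torch
import torch.nn as nn

class InducingPointCrossAttention(nn.Module):
    def __init__(self, num_inducing: int, dim: int, num_heads: int = 4):
        super().__init__()
        # Learned inducing points serve as the queries.
        self.inducing = nn.Parameter(torch.randn(num_inducing, dim))
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, context: torch.Tensor) -> torch.Tensor:
        # context: (batch, n, dim) embeddings of n training datapoints.
        batch = context.shape[0]
        queries = self.inducing.unsqueeze(0).expand(batch, -1, -1)
        # Each of the m inducing points attends to all n datapoints:
        # O(m * n) attention scores, i.e. linear in the context size n.
        summary, _ = self.attn(queries, context, context)
        return summary  # (batch, m, dim): a compressed view of the data

# Usage: compress 10,000 context points into 32 inducing summaries.
layer = InducingPointCrossAttention(num_inducing=32, dim=64)
ctx = torch.randn(2, 10_000, 64)
print(layer(ctx).shape)  # torch.Size([2, 32, 64])
```

Because the number of inducing points m is fixed and small, memory and compute grow linearly as the context set grows, which is what lets this style of model query large training sets at inference time.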