Neural Processes (NPs) are a popular class of approaches for meta-learning. Similar to Gaussian Processes (GPs), NPs define distributions over functions and can estimate uncertainty in their predictions. However, unlike GPs, NPs and their variants suffer from underfitting and often have intractable likelihoods, which limit their applications in sequential decision making. We propose Transformer Neural Processes (TNPs), a new member of the NP family that casts uncertainty-aware meta-learning as a sequence modeling problem. We learn TNPs via an autoregressive likelihood-based objective and instantiate it with a novel transformer-based architecture. The model architecture respects the inductive biases inherent to the problem structure, such as invariance to the ordering of the observed data points and equivariance to the unobserved points. We further investigate knobs within the TNP framework that trade off expressivity of the decoding distribution against extra computation. Empirically, we show that TNPs achieve state-of-the-art performance on various benchmark problems, outperforming all previous NP variants on meta-regression, image completion, contextual multi-armed bandits, and Bayesian optimization.
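The inductive biases described above (context-point invariance, autoregressive decoding over targets) can be encoded in the transformer's attention mask. The sketch below, a hypothetical illustration rather than the authors' reference implementation, builds such a mask for a sequence laid out as context points followed by target points:

```python
import numpy as np

def tnp_attention_mask(num_ctx: int, num_tar: int, autoregressive: bool = True) -> np.ndarray:
    """Boolean attention mask for a TNP-style transformer (True = may attend).

    Sequence layout: [ctx_1..ctx_C, tar_1..tar_T].
    - Context points attend only to context points, giving a set encoding
      that is invariant to the ordering of the observed data.
    - Each target attends to every context point; in the autoregressive
      variant it also attends to itself and to preceding targets.
    Illustrative sketch only; the masking details are an assumption here.
    """
    n = num_ctx + num_tar
    mask = np.zeros((n, n), dtype=bool)
    mask[:, :num_ctx] = True  # every position sees the full context
    if autoregressive:
        for i in range(num_tar):
            # target i additionally sees targets 0..i (causal within targets)
            mask[num_ctx + i, num_ctx : num_ctx + i + 1] = True
    else:
        # non-autoregressive decoding: targets see only the context,
        # plus themselves so each attention row is non-empty
        idx = np.arange(num_ctx, n)
        mask[idx, idx] = True
    return mask
```

Such a mask would be passed to a standard transformer encoder so that target predictions condition on the observed context and, in the autoregressive case, on previously decoded targets.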