Neural Processes (NPs) are popular meta-learning methods that estimate predictive uncertainty on target datapoints by conditioning on a context dataset. The previous state-of-the-art method, Transformer Neural Processes (TNPs), achieves strong performance but requires computation that is quadratic in the number of context datapoints, significantly limiting its scalability. Conversely, existing sub-quadratic NP variants perform significantly worse than TNPs. To address this issue, we propose Latent Bottlenecked Attentive Neural Processes (LBANPs), a new computationally efficient sub-quadratic NP variant whose querying complexity is independent of the number of context datapoints. The model encodes the context dataset into a constant number of latent vectors, on which self-attention is performed. When making predictions, the model retrieves higher-order information from the context dataset via multiple cross-attention mechanisms on the latent vectors. We empirically show that LBANPs achieve results competitive with the state-of-the-art on meta-regression, image completion, and contextual multi-armed bandits. We demonstrate that LBANPs can trade off computational cost against performance by varying the number of latent vectors. Finally, we show that LBANPs can scale to larger dataset settings beyond the reach of existing attention-based NP variants.
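To make the latent-bottleneck mechanism described above concrete, the sketch below (our own illustration, not the authors' implementation; class and argument names, dimensions, and layer choices are assumptions) compresses a context set of size N into a fixed number L of latent vectors and answers target queries by attending only to those latents, so querying cost does not grow with N.

```python
# Minimal sketch of a latent-bottleneck attention scheme, assuming standard
# PyTorch MultiheadAttention layers. Encoding: latents cross-attend to the
# context (O(N*L)), then self-attend (O(L^2)). Querying: targets cross-attend
# only to the L latents (O(M*L)), independent of the context size N.
import torch
import torch.nn as nn


class LatentBottleneckSketch(nn.Module):
    def __init__(self, dim=64, num_latents=8, num_heads=4):
        super().__init__()
        # Learned latent vectors; their count L stays constant regardless of N.
        self.latents = nn.Parameter(torch.randn(num_latents, dim))
        self.cross_to_context = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.latent_self_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.query_to_latents = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def encode(self, context):
        # context: (B, N, dim) embeddings of the context (x, y) pairs.
        B = context.shape[0]
        lat = self.latents.unsqueeze(0).expand(B, -1, -1)      # (B, L, dim)
        lat, _ = self.cross_to_context(lat, context, context)  # absorb context
        lat, _ = self.latent_self_attn(lat, lat, lat)          # mix latents
        return lat

    def query(self, latents, targets):
        # targets: (B, M, dim) embeddings of target inputs; only the latents
        # are touched here, never the N original context points.
        out, _ = self.query_to_latents(targets, latents, latents)
        return out


# Usage: 128 context points are bottlenecked into 8 latents before querying.
model = LatentBottleneckSketch()
ctx = torch.randn(2, 128, 64)
tgt = torch.randn(2, 16, 64)
pred = model.query(model.encode(ctx), tgt)  # shape: (2, 16, 64)
```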