Traditional sequential multi-object attention models rely on a recurrent mechanism to infer object relations. We propose a relational extension (R-SQAIR) of one such attention model (SQAIR), endowing it with a module that has a strong relational inductive bias and computes pairwise interactions between inferred objects in parallel. We study two recently proposed relational modules on tasks of unsupervised learning from videos. We demonstrate gains over sequential relational mechanisms, including in terms of combinatorial generalization.