Processing sets or other unordered, potentially variable-sized inputs in neural networks is usually handled by aggregating a number of input tensors into a single representation. While many aggregation methods already exist, from simple sum pooling to multi-head attention, they are limited in their representational power from both theoretical and empirical perspectives. In search of a fundamentally more powerful aggregation strategy, we propose an optimization-based method called Equilibrium Aggregation. We show that many existing aggregation methods can be recovered as special cases of Equilibrium Aggregation, and that it is provably more efficient in some important cases. Equilibrium Aggregation can be used as a drop-in replacement in many existing architectures and applications. We validate its efficiency on three different tasks: median estimation, class counting, and molecular property prediction. In all experiments, Equilibrium Aggregation achieves higher performance than the other aggregation techniques we test.
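To make the idea of optimization-based aggregation concrete, here is a minimal illustrative sketch, not the paper's actual method: it defines the aggregate of a set as the minimizer of a summed per-element potential, solved by plain gradient descent. The function names and hyperparameters below are our own assumptions; in the actual approach the potential would be a learned neural network and the inner optimization would be differentiated through during training. The sketch only shows how fixed choices of potential recover classic aggregators such as the mean (squared error) and the median (absolute error) as special cases.

```python
# Illustrative sketch of optimization-based ("equilibrium") aggregation.
# The aggregate y* is defined implicitly as
#     y* = argmin_y  sum_i F(x_i, y)
# and is found here by gradient descent on y. All names and hyperparameters
# are hypothetical choices for this example, not the paper's API.

def equilibrium_aggregate(xs, potential_grad, y0=0.0, lr=0.05, steps=2000):
    """Minimize sum_i F(x_i, y) over y, given the gradient of F w.r.t. y."""
    y = y0
    for _ in range(steps):
        grad = sum(potential_grad(x, y) for x in xs)
        y -= lr * grad
    return y

xs = [1.0, 2.0, 9.0]

# Squared-error potential F(x, y) = (x - y)^2 has gradient 2 * (y - x);
# its minimizer is the mean of the set.
mean_like = equilibrium_aggregate(xs, lambda x, y: 2.0 * (y - x))

# Absolute-error potential F(x, y) = |x - y| has subgradient sign(y - x);
# its minimizer is the median of the set (here reached only approximately,
# since plain subgradient descent oscillates around the minimizer).
sign = lambda v: (v > 0) - (v < 0)
median_like = equilibrium_aggregate(xs, lambda x, y: float(sign(y - x)))
```

With `xs = [1.0, 2.0, 9.0]`, `mean_like` converges to the mean 4.0 and `median_like` settles near the median 2.0, illustrating how a single optimization-based aggregator subsumes different pooling operations by swapping the potential.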