Sorting and ranking supervision is a method for training neural networks end-to-end based on ordering constraints: the ground truth order of sets of samples is known, while their absolute values remain unsupervised. To this end, we propose differentiable sorting networks obtained by relaxing their pairwise conditional swap operations. To address the vanishing gradients and extensive blurring that arise with larger numbers of layers, we propose mapping activations to regions with moderate gradients. We consider odd-even as well as bitonic sorting networks, which outperform existing relaxations of the sorting operation. We show that bitonic sorting networks can achieve stable training on large input sets of up to 1024 elements.
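The relaxation of a conditional swap can be illustrated with a short sketch: a hard swap outputs (min, max) of a pair, and a common differentiable surrogate replaces the hard comparison with a sigmoid-weighted soft mixture. The code below is a minimal illustration of this idea applied to an odd-even transposition network, not the paper's exact formulation; the `steepness` parameter and helper names are assumptions for the example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def soft_swap(a, b, steepness=10.0):
    """Differentiable relaxation of a conditional swap.

    s -> 1 when b > a (pair already ordered), s -> 0 when a > b,
    so (lo, hi) smoothly approximates (min(a, b), max(a, b)).
    Note lo + hi == a + b exactly, so each layer preserves the sum.
    """
    s = sigmoid(steepness * (b - a))
    lo = s * a + (1.0 - s) * b
    hi = (1.0 - s) * a + s * b
    return lo, hi

def soft_odd_even_sort(x, steepness=10.0):
    """Odd-even transposition network with relaxed swaps.

    For n inputs, n alternating comparator layers suffice; every
    operation is smooth, so gradients flow to the inputs.
    """
    x = list(x)
    n = len(x)
    for layer in range(n):
        start = layer % 2  # alternate even- and odd-indexed comparator layers
        for i in range(start, n - 1, 2):
            x[i], x[i + 1] = soft_swap(x[i], x[i + 1], steepness)
    return x
```

With a large steepness the output approaches the hard sort (e.g. `soft_odd_even_sort([3.0, 1.0, 2.0])` is close to `[1.0, 2.0, 3.0]`); with a small steepness the outputs blur toward their mean, which is the trade-off the activation-mapping scheme in the abstract is meant to mitigate.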