Positional reasoning is the process of arranging the unordered elements of a set into a consistent structure. We present Positional Diffusion, a plug-and-play graph formulation with Diffusion Probabilistic Models to address positional reasoning. We use the forward process to map the positions of the elements in a set to random positions in a continuous space. Positional Diffusion learns to reverse this noising process and recover the original positions through an Attention-based Graph Neural Network. We conduct extensive experiments on benchmark datasets, including two puzzle datasets, three sentence ordering datasets, and one visual storytelling dataset. The results show that our method outperforms long-standing approaches to puzzle solving, by up to 18% over the second-best deep learning method, and performs on par with state-of-the-art methods on sentence ordering and visual storytelling. Our work highlights the suitability of diffusion models for ordering problems and proposes a novel formulation and method for solving various ordering tasks. Project website at https://iit-pavis.github.io/Positional_Diffusion/
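To make the forward process concrete, the following is a minimal sketch of how the positions of a set of elements could be mapped to random positions in a continuous space. It uses the standard closed-form noising step of Diffusion Probabilistic Models, x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps. The linear noise schedule, the number of steps, the normalization of positions to [-1, 1], and all function names (`make_alpha_bar`, `grid_positions`, `forward_noise`) are illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for an assumed linear schedule."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def grid_positions(rows, cols):
    """Ground-truth 2D positions of puzzle pieces, normalized to [-1, 1]."""
    ys, xs = np.meshgrid(
        np.linspace(-1.0, 1.0, rows),
        np.linspace(-1.0, 1.0, cols),
        indexing="ij",
    )
    return np.stack([xs.ravel(), ys.ravel()], axis=1)


def forward_noise(x0, t, alpha_bar, rng):
    """Sample x_t from q(x_t | x_0) in closed form; also return the noise."""
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return xt, eps


rng = np.random.default_rng(0)
alpha_bar = make_alpha_bar()
x0 = grid_positions(3, 3)  # nine pieces of a hypothetical 3x3 puzzle
xt, eps = forward_noise(x0, t=len(alpha_bar) - 1, alpha_bar=alpha_bar, rng=rng)
```

At the last step `alpha_bar` is close to zero, so `xt` is almost pure Gaussian noise and carries nearly no information about the original layout. A denoising network (in the paper, an Attention-based Graph Neural Network) would then be trained to reverse this mapping; that part is not sketched here.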