Real-world data generation often involves complex inter-dependencies among instances, violating the IID-data hypothesis of standard learning paradigms and posing a challenge for uncovering the geometric structures needed to learn desired instance representations. To this end, we introduce an energy-constrained diffusion model that encodes a batch of instances from a dataset into evolutionary states that progressively incorporate other instances' information through their interactions. The diffusion process is constrained by descent criteria w.r.t.~a principled energy function that characterizes the global consistency of instance representations over latent structures. We provide rigorous theory that implies closed-form optimal estimates for the pairwise diffusion strength among arbitrary instance pairs, which gives rise to a new class of neural encoders, dubbed DIFFormer (diffusion-based Transformers), with two instantiations: a simple version with linear complexity for prohibitively large instance numbers, and an advanced version for learning complex structures. Experiments highlight the wide applicability of our model as a general-purpose encoder backbone with superior performance across various tasks, such as node classification on large graphs, semi-supervised image/text classification, and spatial-temporal dynamics prediction.
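To make the linear-complexity claim concrete, below is a minimal sketch (not the authors' released code) of one all-pair diffusion step over a batch of N instance states. The specific diffusion strength (1 plus cosine similarity), the residual weight `tau`, and the function name `simple_diffusion_step` are illustrative assumptions; the point is that a non-negative similarity of this bilinear form lets the N x N aggregation be evaluated via associativity in O(N d^2) without materializing the pairwise matrix.

```python
# A minimal sketch, assuming a simple diffusivity f(z_i, z_j) = 1 + cos(z_i, z_j)
# and an explicit residual update; both are illustrative choices, not the
# confirmed specification of either DIFFormer instantiation.
import torch

def simple_diffusion_step(z: torch.Tensor, tau: float = 0.5) -> torch.Tensor:
    """One propagation step over a batch of N instance states z of shape (N, d).

    Computes z_i <- (1 - tau) * z_i + tau * sum_j f(z_i, z_j) z_j / sum_j f(z_i, z_j)
    in O(N d^2) by reassociating the matrix products, never forming the N x N matrix.
    """
    zn = torch.nn.functional.normalize(z, dim=-1)         # unit-norm rows, (N, d)
    n = z.shape[0]
    # Numerator: sum_j (1 + zn_i . zn_j) z_j = sum_j z_j + zn @ (zn^T z)
    agg = z.sum(dim=0, keepdim=True) + zn @ (zn.t() @ z)  # (N, d)
    # Denominator: sum_j (1 + zn_i . zn_j) = N + zn_i . (sum_j zn_j)
    deg = n + zn @ zn.sum(dim=0)                          # (N,)
    propagated = agg / deg.unsqueeze(-1)
    # Residual mixing keeps part of the current state at each diffusion step.
    return (1 - tau) * z + tau * propagated
```

Because f is non-negative here, the normalizer `deg` stays positive, and stacking several such steps yields an encoder whose per-layer cost grows linearly in the number of instances.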