Despite their popularity, normalizing flows have so far seen only limited application to categorical data. The current practice of using dequantization to map discrete data to a continuous space is inapplicable, as categorical data have no intrinsic order. Instead, categorical data have complex, latent relations that must be inferred, such as the synonymy between words. In this paper, we investigate \emph{Categorical Normalizing Flows}, that is, normalizing flows for categorical data. By casting the encoding of categorical data in continuous space as a variational inference problem, we jointly optimize the continuous representation and the model likelihood. Using a factorized decoder, we introduce an inductive bias to model any interactions in the normalizing flow. As a consequence, we not only simplify the optimization compared to having a joint decoder, but also make it possible to scale up to a large number of categories, which is currently impossible with discrete normalizing flows. Based on Categorical Normalizing Flows, we propose GraphCNF, a permutation-invariant generative model on graphs. GraphCNF implements a three-step approach, modeling the nodes, edges, and adjacency matrix stepwise to increase efficiency. On molecule generation, GraphCNF outperforms both one-shot and autoregressive flow-based state-of-the-art models.
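To make the variational view concrete, the following is a minimal sketch of a one-sample lower bound on the log-likelihood of a categorical sequence. It is not the paper's implementation and rests on several assumptions that the abstract does not state: each category gets its own diagonal Gaussian encoder $q(z_i \mid x_i)$, the factorized decoder $p(x_i \mid z_i)$ is obtained from those encoders by Bayes' rule with a category prior, and a standard normal stands in for the normalizing flow prior $p(z)$ (so, unlike a real flow, this stand-in models no interactions between positions). All function and parameter names (`log_normal`, `decoder_log_probs`, `elbo_sample`, `means`, `log_stds`, `log_prior`) are hypothetical.

```python
import math
import random


def log_normal(z, mean, log_std):
    """Log-density of a diagonal Gaussian evaluated at the vector z."""
    return sum(
        -0.5 * ((zi - m) / math.exp(s)) ** 2 - s - 0.5 * math.log(2 * math.pi)
        for zi, m, s in zip(z, mean, log_std)
    )


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def decoder_log_probs(z, means, log_stds, log_prior):
    """Factorized decoder: log p(x_i = c | z_i) for every category c.

    Assumption: the decoder is the Bayes posterior over categories given the
    per-category Gaussian encoders and a prior over categories.
    """
    joint = [
        lp + log_normal(z, m, s)
        for lp, m, s in zip(log_prior, means, log_stds)
    ]
    norm = logsumexp(joint)
    return [j - norm for j in joint]


def elbo_sample(x, means, log_stds, log_prior, rng):
    """One-sample estimate of the bound E_q[log p(z) + log p(x|z) - log q(z|x)].

    x is a list of category indices; each position is encoded independently.
    """
    dim = len(means[0])
    total = 0.0
    for xi in x:
        # Sample z_i ~ q(z_i | x_i) with the reparameterization trick.
        z = [
            m + math.exp(s) * rng.gauss(0.0, 1.0)
            for m, s in zip(means[xi], log_stds[xi])
        ]
        log_q = log_normal(z, means[xi], log_stds[xi])
        log_dec = decoder_log_probs(z, means, log_stds, log_prior)[xi]
        # Assumption: standard normal as a stand-in for the flow density p(z).
        log_pz = log_normal(z, [0.0] * dim, [0.0] * dim)
        total += log_pz + log_dec - log_q
    return total


# Hypothetical toy setup: 4 categories embedded in a 2-dimensional latent space.
rng = random.Random(0)
num_categories, dim = 4, 2
means = [[rng.gauss(0.0, 2.0) for _ in range(dim)] for _ in range(num_categories)]
log_stds = [[-1.0] * dim for _ in range(num_categories)]
log_prior = [math.log(1.0 / num_categories)] * num_categories

bound = elbo_sample([0, 3, 1], means, log_stds, log_prior, rng)
print(bound)
```

In a trainable version, `means` and `log_stds` would be learned jointly with the flow parameters by maximizing this bound, which is what "jointly optimize the continuous representation and the model likelihood" refers to. Because the decoder acts on each position separately, its cost grows linearly with the number of categories rather than with the number of joint configurations.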