Graph-Collaborated Auto-Encoder Hashing for Multi-view Binary Clustering

With the explosive growth of large-scale data, unsupervised hashing methods have attracted widespread attention because learning compact binary codes can greatly reduce storage and computation. Existing unsupervised hashing methods attempt to exploit the valuable information in the samples but fail to take the local geometric structure of unlabeled samples into consideration. Moreover, hashing based on auto-encoders aims to minimize the reconstruction loss between the input data and the binary codes, which ignores the potential consistency and complementarity of multi-source data. To address these issues, we propose a hashing algorithm based on auto-encoders for multi-view binary clustering, called Graph-Collaborated Auto-Encoder Hashing for Multi-view Binary Clustering (GCAE), which dynamically learns affinity graphs with low-rank constraints and adopts collaborative learning between auto-encoders and affinity graphs to learn a unified binary code. Specifically, we propose a multi-view affinity graph learning model with a low-rank constraint, which can mine the underlying geometric information from multi-view data. Then, we design an encoder-decoder paradigm in which the multiple affinity graphs collaborate, so that a unified binary code can be learned effectively. Notably, we impose decorrelation and code balance constraints on the binary codes to reduce quantization errors. Finally, we use an alternating iterative optimization scheme to obtain the multi-view clustering results. Extensive experimental results on five public datasets demonstrate the effectiveness of the algorithm and its superior performance over other state-of-the-art alternatives.
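The abstract names decorrelation and code balance constraints without stating their form. As a minimal sketch, assuming the formulation commonly used in the hashing literature (a code matrix $B \in \{-1,+1\}^{r \times n}$ with balance $B\mathbf{1} = 0$ and decorrelation $BB^\top = nI$), the two constraints can be measured as penalties. The function names `binarize`, `balance_penalty`, and `decorrelation_penalty` are hypothetical and not taken from the paper, and GCAE's actual objective may differ.

```python
import numpy as np

def binarize(H):
    """Map real-valued embeddings to codes in {-1, +1} (zeros map to +1)."""
    return np.where(H >= 0, 1, -1)

def balance_penalty(B):
    """Squared norm of B @ 1 (assumed form of the balance constraint).

    Zero when every bit (row) is +1 on exactly half of the samples.
    """
    return float(np.sum(B.sum(axis=1) ** 2))

def decorrelation_penalty(B):
    """Squared Frobenius norm of B B^T - n I (assumed decorrelation form).

    Zero when the bits (rows) are pairwise uncorrelated.
    """
    r, n = B.shape
    G = B @ B.T - n * np.eye(r)
    return float(np.sum(G ** 2))

# Three bits over four samples: balanced rows that are mutually orthogonal.
B_good = np.array([[1, -1,  1, -1],
                   [1,  1, -1, -1],
                   [1, -1, -1,  1]])

# Two identical, unbalanced bits: both penalties are nonzero.
B_bad = np.array([[1, 1, 1, -1],
                  [1, 1, 1, -1]])

print(balance_penalty(B_good), decorrelation_penalty(B_good))
print(balance_penalty(B_bad), decorrelation_penalty(B_bad))
```

Under these assumptions, a code matrix in which every bit splits the samples evenly and no two bits carry the same information incurs zero penalty, whereas duplicated or skewed bits are penalized.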
