Deep Self-Evolution Clustering

April 13, 2019 · 我爱读PAMI

This post is about clustering, recast as a similarity-comparison problem. A deep model decides whether two patterns are similar, while the labels it learns from are themselves obtained from the similarities via two thresholds; clustering then falls out of learning the model parameters and the thresholds jointly. The author is Jianlong Chang, a standout researcher at the Institute of Automation, Chinese Academy of Sciences.




Deep Self-Evolution Clustering

Jianlong Chang; Gaofeng Meng; Lingfeng Wang; Shiming Xiang; Chunhong Pan

IEEE Transactions on Pattern Analysis and Machine Intelligence

Year: 2019 (Early Access)

Pages: 1–1


Clustering is a crucial but challenging task in pattern analysis and machine learning. Existing methods often ignore the combination between representation learning and clustering. To tackle this problem, we reconsider the clustering task from its definition to develop Deep Self-Evolution Clustering (DSEC) to jointly learn representations and cluster data. For this purpose, the clustering task is recast as a binary pairwise-classification problem to estimate whether pairwise patterns are similar. Specifically, similarities between pairwise patterns are defined by the dot product between indicator features which are generated by a deep neural network (DNN). To learn informative representations for clustering, clustering constraints are imposed on the indicator features to represent specific concepts with specific representations. Since the ground-truth similarities are unavailable in clustering, an alternating iterative algorithm called Self-Evolution Clustering Training (SECT) is presented to select similar and dissimilar pairwise patterns and to train the DNN alternately. Consequently, the indicator features tend to be one-hot vectors and the patterns can be clustered by locating the largest response of the learned indicator features. Extensive experiments strongly evidence that DSEC outperforms current models on twelve popular image, text and audio datasets consistently.
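The pairwise mechanics above are compact enough to sketch. Below is a minimal, illustrative rendering of one SECT-style step in Python/NumPy, assuming the DNN has already produced softmax "indicator features" for a batch; the threshold values and loss form are placeholders, not the paper's exact formulation.

```python
import numpy as np

def sect_step(F, upper=0.95, lower=0.45, eps=1e-12):
    """One self-evolution step on a batch of indicator features.

    F : (n, k) array of softmax outputs of the DNN ("indicator features").
    Pairs whose dot-product similarity exceeds `upper` are selected as
    similar (label 1), those below `lower` as dissimilar (label 0);
    ambiguous pairs in between are ignored this round.
    """
    S = F @ F.T                       # pairwise similarities, all in [0, 1]
    pos = S > upper                   # selected similar pairs
    neg = S < lower                   # selected dissimilar pairs
    # binary pairwise-classification loss on the selected pairs only
    loss = -(np.log(S[pos] + eps).sum() + np.log(1.0 - S[neg] + eps).sum())
    return loss / max(pos.sum() + neg.sum(), 1)
```

As training alternates between selecting pairs and updating the DNN, the two thresholds are moved toward each other so that more and more pairs receive labels; once the indicator features are approximately one-hot, each pattern's cluster is read off as `F.argmax(axis=1)`.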




Related Content

The Institute of Automation, Chinese Academy of Sciences (CASIA), founded in October 1956, is China's earliest national research institute for automation and the earliest national institute to pursue brain-inspired intelligence research. CASIA is the lead unit of the CAS "Institute of Artificial Intelligence Innovation", one of the host institutions of the "Center for Excellence in Brain Science and Intelligence Technology", and the first organizer at home or abroad of an "Academy of Artificial Intelligence", with a complete disciplinary layout and areas of strength spanning intelligence mechanisms, AI chips, algorithms and intelligent systems.

Over more than sixty years, CASIA has made major contributions to China's economic development, social progress, scientific advancement and national security. In the early years after the founding of the PRC, it pioneered control science in China and made historic contributions to the "Two Bombs, One Satellite" program; in the reform era it opened up the new fields of pattern recognition and intelligent information processing; in the 1990s, building on control science, it was among the first to lay out artificial intelligence research; from 2010 it took the lead in brain-inspired intelligence; and in 2018 it launched a new research agenda on autonomously evolving intelligence.

CASIA currently has 14 research and development departments, including the National Laboratory of Pattern Recognition, the State Key Laboratory of Management and Control for Complex Systems, the National Engineering Research Center for Application-Specific Integrated Circuit Design, the CAS Key Laboratory of Molecular Imaging, and the Brainnetome Center, plus a number of joint laboratories and engineering centers co-built with international and domestic partners.

At the end of 2018, CASIA had 898 staff, of whom 696 were scientific and technical personnel, including 2 members of the Chinese Academy of Sciences, 1 member of The World Academy of Sciences, 103 professors and senior engineers, and 221 associate professors and senior engineering staff; 1 recruit under the national "Thousand Talents Plan" and 1 under its youth program; 23 recruits of the CAS "Hundred Talents Program" (2 new); 9 IEEE Fellows (3 new); 14 recipients of the National Science Fund for Distinguished Young Scholars; 11 members of the "Ten Thousand Talents Plan" (5 new); 10 members of the National Hundred-Thousand-Ten Thousand Talents Project; 5 young and mid-career science and technology leaders named by the Ministry of Science and Technology (3 new); and 5 recipients of the Excellent Young Scientists Fund.

CASIA was among the first institutions authorized to award master's and doctoral degrees by the Academic Degrees Committee of the State Council in 1981. It now has a first-level-discipline doctoral program in Control Science and Engineering, a second-level-discipline doctoral program in Computer Application Technology, and a postdoctoral research station in Control Science and Engineering, with 722 graduate students (273 master's and 449 doctoral) and 81 postdoctoral researchers.

CASIA has long been committed to research on intelligent science and technology, building systematic theories, methods and frameworks in biometric recognition, machine learning, visual computing, natural language processing, intelligent robotics and AI chips, with rich research results. It has formed an intelligent-technology ecosystem running from original innovation and core technology R&D to technology transfer, and is on its way to becoming a strategically important high-tech research institution with international influence in intelligent science and technology.

In recent years, CASIA has won more than 30 awards at the provincial and ministerial level or above; its publication output has grown year by year with rising quality, and its patent applications and grants have climbed steadily, placing it in the top ten of Beijing's research institutions for many years. Its Brainnetome Atlas established, for the first time, a macro-scale in vivo whole-brain connectivity atlas, earning broad attention and praise from international peers; its quantized neural processing unit (QNPU), through independently developed architecture design and neural network optimization, was the first to run large-scale deep neural networks standalone on resource-constrained chips, leading the industry. Its biometric recognition technology covers identifiable biometric traits from medium to long range (iris, face, gait), and a series of proprietary long-range iris-face multimodal identification products have been deployed in key national security applications, with the underlying technology selected among the "Top 10 Technology Breakthroughs" of 2018. Its speech intelligence processing solutions have withstood large-scale real-world deployment and their system interfaces have become industry standards; the "Zidong Voice Cloud", built on CASIA speech recognition, has been rolled out in Alibaba mobile apps such as Taobao and Laiwang. Its molecular imaging surgical navigation system passed the National Medical Products Administration's safety and efficacy certification for medical devices and entered clinical use. "Theory and methods for efficient and highly maneuverable control of bionic robotic fish" won a second prize of the 2017 State Natural Science Award; the robotic dolphin it developed achieved a top straight-line swimming speed of 1.5 body lengths per second and was the first robotic dolphin in the world to leap completely out of the water. Its intelligent video surveillance and face recognition technologies were successfully applied to security at the 2008 Beijing Olympics and the 2010 Shanghai World Expo. Its AI program "CASIA-Prophet 1.0", built on a hybrid knowledge- and data-driven architecture, defeated top human players by a lopsided 7:1 score in the finals of the first national wargaming contest in 2017, demonstrating the strength of AI in adversarial gaming. With China Daily it built the "Global Media Cloud" platform, which has been widely praised. Its fully automatic AOI inspection equipment for electronic optical glass printing monitors key screen-printing process quality across the board and automates the entire screen-printing line, filling a gap in the electronic glass industry. Its "ACP-based intelligent management system for petrochemical enterprises" has been deployed at Maoming Petrochemical and Qilu Petrochemical, providing an effective tool for fine-grained production management and winning the first prize for scientific and technological progress from China's petroleum and chemical automation industry.

On its new journey, CASIA is striving to build an internationally renowned national institute that is well-run and efficient, democratic and harmonious, with an excellent environment and strong capacity for scientific innovation and sustainable development, making new and greater contributions to China's science and technology and to building a moderately prosperous society.

Clustering is one of the most fundamental and widespread techniques in exploratory data analysis. Yet the basic approach to clustering has not really changed: a practitioner hand-picks a task-specific clustering loss to optimize and fits the given data to reveal the underlying cluster structure. Some losses, such as k-means or its non-linear version, kernelized k-means (centroid based), and DBSCAN (density based), are popular choices due to their good empirical performance on a range of applications. Every so often, though, the clustering output using these standard losses fails to reveal the underlying structure, and the practitioner has to custom-design their own variation. In this work we take an intrinsically different approach to clustering: rather than fitting a dataset to a specific clustering loss, we train a recurrent model that learns how to cluster. The model uses as training pairs examples of datasets (as input) and their corresponding cluster identities (as output). By providing multiple types of training datasets as inputs, our model has the ability to generalize well on unseen datasets (new clustering tasks). Our experiments reveal that by training on simple synthetically generated datasets or on existing real datasets, we can achieve better clustering performance on unseen real-world datasets when compared with standard benchmark clustering techniques. Our meta clustering model works well even for small datasets, where the usual deep learning models tend to perform worse.
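As an illustration of the training-pair construction this abstract describes, the sketch below generates simple synthetic datasets with known cluster identities; the Gaussian-mixture form is an assumption about what "synthetically generated datasets" means here, and the recurrent model itself is omitted.

```python
import numpy as np

def synthetic_clustering_task(n_points=200, k=3, dim=2, rng=None):
    """One (dataset, cluster identities) training pair for meta-training."""
    rng = rng or np.random.default_rng()
    centers = rng.normal(scale=5.0, size=(k, dim))   # well-separated cluster means
    labels = rng.integers(0, k, size=n_points)       # ground-truth identities
    X = centers[labels] + rng.normal(size=(n_points, dim))
    return X, labels

# the meta-learner consumes many such (input dataset, output labels) pairs,
# then generalizes to unseen datasets, i.e. new clustering tasks
tasks = [synthetic_clustering_task() for _ in range(1000)]
```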


Graph convolutional network (GCN) has been successfully applied to many graph-based applications; however, training a large-scale GCN remains challenging. Current SGD-based algorithms suffer from either a high computational cost that grows exponentially with the number of GCN layers, or a large space requirement for keeping the entire graph and the embedding of each node in memory. In this paper, we propose Cluster-GCN, a novel GCN algorithm that is suitable for SGD-based training by exploiting the graph clustering structure. Cluster-GCN works as follows: at each step, it samples a block of nodes associated with a dense subgraph identified by a graph clustering algorithm, and restricts the neighborhood search within this subgraph. This simple but effective strategy leads to significantly improved memory and computational efficiency while achieving test accuracy comparable to previous algorithms. To test the scalability of our algorithm, we create a new Amazon2M dataset with 2 million nodes and 61 million edges, more than 5 times larger than the previous largest publicly available dataset (Reddit). For training a 3-layer GCN on this data, Cluster-GCN is faster than the previous state-of-the-art VR-GCN (1523 seconds vs. 1961 seconds) while using much less memory (2.2GB vs. 11.2GB). Furthermore, for training a 4-layer GCN on this data, our algorithm finishes in around 36 minutes while all existing GCN training algorithms fail due to out-of-memory issues. Cluster-GCN also allows us to train much deeper GCNs without much time and memory overhead, which leads to improved prediction accuracy: using a 5-layer Cluster-GCN, we achieve a state-of-the-art test F1 score of 99.36 on the PPI dataset, while the previous best result was 98.71 by [16]. Our codes are publicly available at https://github.com/google-research/google-research/tree/master/cluster_gcn.
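A minimal sketch of the sampling loop just described, assuming a partition of the nodes produced beforehand by a graph clustering algorithm (the paper uses METIS) and an abstract `train_step` that runs SGD on one subgraph; the adjacency matrix is dense here only for brevity.

```python
import numpy as np

def cluster_gcn_epoch(adj, X, y, partition, train_step, rng):
    """adj: (n, n) adjacency matrix, X: node features, y: node labels,
    partition: list of node-index arrays from a graph clustering algorithm."""
    for c in rng.permutation(len(partition)):
        nodes = partition[c]
        # restrict both the loss and the neighbourhood search to this block,
        # so memory scales with the subgraph rather than the full graph
        sub_adj = adj[np.ix_(nodes, nodes)]
        train_step(sub_adj, X[nodes], y[nodes])
```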


Unsupervised text clustering is one of the major tasks in natural language processing (NLP) and remains a difficult and complex problem. Conventional methods generally treat this task in separate steps: learning text representations and then clustering them. As an improvement, neural methods have been introduced for continuous representation learning to address the sparsity problem. However, the multi-step process still deviates from a unified optimization target, and the clustering step in particular is usually performed with conventional methods such as k-means. We propose a pure neural framework for text clustering in an end-to-end manner. It jointly learns the text representation and the clustering model. Our model works well when context can be obtained, which is nearly always the case in NLP. We evaluate our method on two widely used benchmarks: IMDB movie reviews for sentiment classification and 20-Newsgroups for topic categorization. Despite its simplicity, experiments show the model outperforms previous clustering methods by a large margin. Furthermore, the model is also verified on the English Wikipedia dataset as a large corpus.


Clustering is an essential data mining tool that aims to discover inherent cluster structure in data. For most applications, applying clustering is only appropriate when cluster structure is present. As such, the study of clusterability, which evaluates whether data possesses such structure, is an integral part of cluster analysis. However, methods for evaluating clusterability vary radically, making it challenging to select a suitable measure. In this paper, we perform an extensive comparison of measures of clusterability and provide guidelines that clustering users can reference to select suitable measures for their applications.
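As a concrete instance of what such a measure looks like, below is a sketch of the Hopkins statistic, one classical clusterability test of the kind compared in studies like this one (whether it is among this paper's specific measures is an assumption).

```python
import numpy as np
from scipy.spatial import cKDTree

def hopkins(X, m=50, rng=None):
    """Hopkins statistic: ~0.5 for spatially random data, near 1 when clustered."""
    rng = rng or np.random.default_rng()
    n, d = X.shape
    tree = cKDTree(X)
    # nearest-data distances from m uniform probes in the data's bounding box
    U = rng.uniform(X.min(0), X.max(0), size=(m, d))
    u = tree.query(U, k=1)[0]
    # nearest-neighbour distances from m sampled data points (k=2 skips self)
    w = tree.query(X[rng.choice(n, m, replace=False)], k=2)[0][:, 1]
    return u.sum() / (u.sum() + w.sum())
```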


As a new classification platform, deep learning has recently received increasing attention from researchers and has been successfully applied to many domains. In some domains, like bioinformatics and robotics, it is very difficult to construct a large-scale, well-annotated dataset due to the expense of data acquisition and costly annotation, which limits development. Transfer learning relaxes the hypothesis that the training data must be independent and identically distributed (i.i.d.) with the test data, which motivates the use of transfer learning to solve the problem of insufficient training data. This survey reviews current research on transfer learning using deep neural networks and its applications. We define deep transfer learning, categorize it, and review recent work according to the techniques used.


Learning compact binary codes for the image retrieval problem using deep neural networks has attracted increasing attention recently. However, training deep hashing networks is challenging due to the binary constraints on the hash codes, the similarity preserving property, and the requirement for a vast amount of labelled images. To the best of our knowledge, none of the existing methods has tackled all of these challenges completely in a unified framework. In this work, we propose a novel end-to-end deep hashing approach, which is trained to produce binary codes directly from image pixels without the need for manual annotation. In particular, we propose a novel pairwise binary constrained loss function, which simultaneously encodes the distances between pairs of hash codes and the binary quantization error. To train the network with the proposed loss function, we also propose an efficient parameter learning algorithm. In addition, to provide similar/dissimilar training images for the network, we exploit 3D models reconstructed from unlabelled images to automatically generate enormous numbers of similar/dissimilar pairs. Extensive experiments on three image retrieval benchmark datasets demonstrate the superior performance of the proposed method over state-of-the-art hashing methods on the image retrieval problem.
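A hedged sketch of a loss with the two ingredients named above: a term that pulls the relaxed codes of similar pairs together (and pushes dissimilar pairs apart by a margin), plus a quantization term penalising distance from {-1, +1}. The margin, weighting, and exact functional form are illustrative stand-ins, not the paper's formulation.

```python
import numpy as np

def pairwise_binary_loss(B, sim, margin=2.0, lam=0.1, eps=1e-12):
    """B  : (n, L) real-valued network outputs (relaxed hash codes).
    sim: (n, n) 0/1 similarity matrix from the automatically mined pairs."""
    D = ((B[:, None, :] - B[None, :, :]) ** 2).sum(-1)   # pairwise squared distances
    pull = (sim * D).sum()                               # similar pairs: stay close
    push = ((1 - sim) * np.maximum(margin - np.sqrt(D + eps), 0) ** 2).sum()
    quant = ((np.abs(B) - 1.0) ** 2).sum()               # binary quantization error
    return pull + push + lam * quant
```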


We present a new method that learns to segment and cluster images without labels of any kind. A simple loss based on information theory is used to extract meaningful representations directly from raw images. This is achieved by maximising the mutual information of images known to be related by spatial proximity or randomized transformations, which distills their shared abstract content. Unlike much of the work in unsupervised deep learning, our learned function outputs segmentation heatmaps and discrete classification labels directly, rather than embeddings that need further processing to be usable. The loss can be formulated as a convolution, making it the first end-to-end unsupervised learning method that learns densely and efficiently for semantic segmentation. Implemented using realistic settings on generic deep neural network architectures, our method attains superior performance on COCO-Stuff and ISPRS-Potsdam for segmentation and STL for clustering, beating state-of-the-art baselines.
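The information-theoretic loss is simple enough to write down. A minimal sketch, following the standard formulation of this method: the joint distribution over the cluster assignments of an image and its transformed partner is estimated from a batch, and its mutual information is maximised (here, the negative is minimised).

```python
import numpy as np

def neg_mutual_information(P1, P2, eps=1e-10):
    """P1, P2: (n, k) softmax outputs for n (image, transformed image) pairs."""
    J = P1.T @ P2 / len(P1)        # (k, k) joint over paired cluster assignments
    J = (J + J.T) / 2.0            # symmetrise: the pairing has no order
    Pi, Pj = J.sum(1, keepdims=True), J.sum(0, keepdims=True)  # marginals
    return -(J * (np.log(J + eps) - np.log(Pi + eps) - np.log(Pj + eps))).sum()
```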


A major goal of unsupervised learning is to discover data representations that are useful for subsequent tasks, without access to supervised labels during training. Typically, this goal is approached by minimizing a surrogate objective, such as the negative log likelihood of a generative model, in the hope that representations useful for subsequent tasks will arise incidentally. In this work, we propose instead to directly target a later desired task by meta-learning an unsupervised learning rule which leads to representations useful for that task. Here, our desired task (meta-objective) is the performance of the representation on semi-supervised classification, and we meta-learn an algorithm -- an unsupervised weight update rule -- that produces representations that perform well under this meta-objective. Additionally, we constrain our unsupervised update rule to be a biologically-motivated, neuron-local function, which enables it to generalize to novel neural network architectures. We show that the meta-learned update rule produces useful features and sometimes outperforms existing unsupervised learning techniques. We further show that the meta-learned unsupervised update rule generalizes to train networks with different widths, depths, and nonlinearities. It also generalizes to train on data with randomly permuted input dimensions and even generalizes from image datasets to a text task.


Classifying large-scale networks into several categories and distinguishing them according to their fine structure is of great importance, with several applications in real life. However, most studies of complex networks focus on properties of a single network and seldom on classification, clustering, and comparison between different networks, in which the network is treated as a whole. Due to the non-Euclidean properties of the data, conventional methods can hardly be applied to networks directly. In this paper, we propose a novel framework, the complex network classifier (CNC), which integrates network embedding and convolutional neural networks to tackle the problem of network classification. By training the classifier on synthetic complex network data and real international trade network data, we show CNC can not only classify networks with high accuracy and robustness but can also extract the features of the networks automatically.


In the same vein as discriminative one-shot learning, Siamese networks allow recognizing an object from a single exemplar with the same class label. However, they do not take advantage of the underlying structure of the data and the relationships among the multitude of samples, as they rely only on pairs of instances for training. In this paper, we propose a new quadruplet deep network to examine the potential connections among the training instances, aiming to achieve a more powerful representation. We design four shared networks that receive multi-tuples of instances as inputs and are connected by a novel loss function consisting of a pair-loss and a triplet-loss. According to the similarity metric, we select the most similar and the most dissimilar instances from each multi-tuple as the positive and negative inputs of the triplet loss. We show that this scheme improves the training performance. Furthermore, we introduce a new weight layer to automatically select suitable combination weights, avoiding the conflict between triplet and pair loss that would otherwise degrade performance. We evaluate our quadruplet framework on model-free tracking-by-detection of objects from a single initial exemplar on several Visual Object Tracking benchmarks. Our extensive experimental analysis demonstrates that our tracker achieves superior performance with a real-time processing speed of 78 frames per second (fps).
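A hedged sketch of the combined objective described above: from each multi-tuple, the instances most and least similar to the anchor feed a triplet term, which is mixed with a pair term; the fixed mixing weight here stands in for the weight layer the paper learns.

```python
import numpy as np

def quadruplet_loss(anchor, positive, tuple_embs, margin=1.0, w=0.5):
    """anchor, positive: (d,) embeddings; tuple_embs: (m, d) remaining instances."""
    d = ((tuple_embs - anchor) ** 2).sum(-1)
    hard_pos = tuple_embs[d.argmin()]            # most similar instance
    hard_neg = tuple_embs[d.argmax()]            # most dissimilar instance
    pair = ((anchor - positive) ** 2).sum()      # pair-loss on the exemplar pair
    triplet = max(((anchor - hard_pos) ** 2).sum()
                  - ((anchor - hard_neg) ** 2).sum() + margin, 0.0)
    return w * pair + (1.0 - w) * triplet
```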

Related News

Unsupervised Meta-Learning for Representation Learning
CreateAMind
20+ reads · Jan 4, 2019
Unsupervised Learning via Meta-Learning
CreateAMind
29+ reads · Jan 3, 2019
Three long papers from our center accepted to ACL 2018
哈工大SCIR
4+ reads · Apr 24, 2018
Going with the flow: Similarity-Adaptive and Discrete Optimization
我爱读PAMI
4+ reads · Feb 6, 2018
[Recommended] A guide to natural language processing (NLP)
机器学习研究会
33+ reads · Nov 17, 2017
[Recommended] Hands-on deep sentiment analysis with MXNet
机器学习研究会
16+ reads · Oct 4, 2017
Related Papers

Meta-Learning to Cluster
Yibo Jiang, Nakul Verma
14+ reads · Oct 30, 2019
Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks
Wei-Lin Chiang, Xuanqing Liu, Si Si, Yang Li, Samy Bengio, Cho-Jui Hsieh
9+ reads · Aug 8, 2019
An end-to-end Neural Network Framework for Text Clustering
Jie Zhou, Xingyi Cheng, Jinchao Zhang
6+ reads · Mar 22, 2019
To Cluster, or Not to Cluster: An Analysis of Clusterability Methods
A. Adolfsson, M. Ackerman, N. C. Brownstein
3+ reads · Aug 24, 2018
A Survey on Deep Transfer Learning
Chuanqi Tan, Fuchun Sun, Tao Kong, Wenchang Zhang, Chao Yang, Chunfang Liu
10+ reads · Aug 6, 2018
Binary Constrained Deep Hashing Network for Image Retrieval without Manual Annotation
Thanh-Toan Do, Khoa Le, Trung Pham, Tuan Hoang, Huu Le, Ngai-Man Cheung, Ian Reid
3+ reads · Aug 2, 2018
Invariant Information Distillation for Unsupervised Image Segmentation and Clustering
Xu Ji, João F. Henriques, Andrea Vedaldi
4+ reads · Jul 21, 2018
Meta-Learning Update Rules for Unsupervised Representation Learning
Luke Metz, Niru Maheswaranathan, Brian Cheung, Jascha Sohl-Dickstein
6+ reads · May 23, 2018
Complex Network Classification with Convolutional Neural Network
Ruyue Xin, Jiang Zhang, Yitong Shao
5+ reads · Apr 8, 2018
Quadruplet Network with One-Shot Learning for Fast Visual Object Tracking
Xingping Dong, Jianbing Shen, Yu Liu, Wenguan Wang, Fatih Porikli
9+ reads · Mar 17, 2018