Semi-Supervised Clustering via Markov Chain Aggregation

We connect the problem of semi-supervised clustering to constrained Markov aggregation, i.e., the task of partitioning the state space of a Markov chain. We achieve this connection by considering every data point in the dataset as an element of the Markov chain's state space, by defining the transition probabilities between states via similarities between corresponding data points, and by incorporating semi-supervision information as hard constraints in a Hartigan-style algorithm. The introduced Constrained Markov Clustering (CoMaC) is an extension of a recent information-theoretic framework for (unsupervised) Markov aggregation to the semi-supervised case. Instantiating CoMaC for certain parameter settings further generalizes two previous information-theoretic objectives for unsupervised clustering. Our results indicate that CoMaC is competitive with the state of the art. Furthermore, our approach is less sensitive to hyperparameter settings than the unsupervised counterpart, which is especially attractive in the semi-supervised setting characterized by little labeled data.
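The construction described in the abstract (data points as states, similarities as transition probabilities, supervision as hard constraints) can be illustrated with a minimal sketch. The Gaussian kernel, the `sigma` parameter, the pairwise must-link/cannot-link encoding, and the function names `transition_matrix` and `satisfies_constraints` are all assumptions made for illustration; the abstract does not specify the similarity measure or the constraint format, and the information-theoretic cost optimized by CoMaC is not reproduced here.

```python
import math

def transition_matrix(points, sigma=1.0):
    """Build a row-stochastic matrix from pairwise similarities.

    Assumption: a Gaussian kernel on Euclidean distance is used as the
    similarity; the abstract does not name a specific kernel.
    """
    n = len(points)
    P = []
    for i in range(n):
        # Similarity of point i to every point j (including itself).
        row = [
            math.exp(-math.dist(points[i], points[j]) ** 2 / (2 * sigma ** 2))
            for j in range(n)
        ]
        total = sum(row)  # always > 0 because the self-similarity is 1
        P.append([w / total for w in row])  # normalize row to sum to 1
    return P

def satisfies_constraints(labels, must_link, cannot_link):
    """Check a candidate partition against hard pairwise constraints.

    Assumption: supervision is encoded as index pairs (hypothetical format).
    """
    return (all(labels[i] == labels[j] for i, j in must_link)
            and all(labels[i] != labels[j] for i, j in cannot_link))

# Four data points forming two well-separated groups.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
P = transition_matrix(points)
```

In a sequential, Hartigan-style search, a check like `satisfies_constraints` would be one way to reject any single-point move that violates the supervision, so that only feasible partitions of the state space are ever considered.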

Related content

Markov chain

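A Markov chain, named after Andrey Markov (A. A. Markov, 1856-1922), is a discrete-time stochastic process in mathematics that has the Markov property: given the present state, the past (the states visited before now) is irrelevant for predicting the future (the states visited after now). At each step of a Markov chain, the system either moves from one state to another or stays in its current state, according to a probability distribution. A change of state is called a transition, and the probabilities associated with the possible state changes are called transition probabilities. A random walk is an example of a Markov chain: the state at each step is a vertex of a graph, and each step moves to one of the adjacent vertices, each with the same probability, regardless of the path taken so far.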
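The random-walk example can be made concrete with a short sketch. The four-vertex graph and the names `transition_probabilities` and `random_walk` are hypothetical and chosen only for illustration.

```python
import random

# Hypothetical undirected graph given as adjacency lists.
graph = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B", "D"],
    "D": ["C"],
}

def transition_probabilities(graph):
    """Each neighbor of a vertex is reached with equal probability."""
    return {u: {v: 1.0 / len(nbrs) for v in nbrs} for u, nbrs in graph.items()}

def random_walk(graph, start, steps, seed=0):
    """Simulate the walk; the next vertex depends only on the current one."""
    rng = random.Random(seed)
    path = [start]
    for _ in range(steps):
        # Markov property: only path[-1] is consulted, never earlier history.
        path.append(rng.choice(graph[path[-1]]))
    return path

P = transition_probabilities(graph)
walk = random_walk(graph, "A", 10)
```

Here `P["C"]` assigns probability 1/3 to each of the three neighbors of `C`, and every consecutive pair of vertices in `walk` is joined by an edge of the graph.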
