Learning from a continuous stream of non-stationary data in an unsupervised manner is arguably one of the most common and most challenging settings facing intelligent agents. Here, we address learning under all three conditions (unsupervised, streaming, non-stationary) in the context of clustering, also known as mixture modeling. We introduce a novel clustering algorithm that endows mixture models with the ability to create new clusters online, as demanded by the data, in a probabilistic, time-varying, and principled manner. To achieve this, we first define a novel stochastic process called the Dynamical Chinese Restaurant Process (Dynamical CRP), which is a non-exchangeable distribution over partitions of a set; next, we show that the Dynamical CRP provides a non-stationary prior over cluster assignments and yields an efficient streaming variational inference algorithm. We conclude with experiments showing that the Dynamical CRP can be applied to diverse synthetic and real data with Gaussian and non-Gaussian likelihoods.
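For context, the classical (exchangeable) Chinese Restaurant Process that the Dynamical CRP generalizes can be sketched as below. This is only a minimal illustration of the standard process, not the Dynamical CRP itself, whose time-varying dynamics are not specified in this abstract. The function name `sample_crp` and the concentration parameter `alpha` are assumptions introduced here for illustration.

```python
import random
from typing import List, Optional


def sample_crp(n: int, alpha: float, seed: Optional[int] = None) -> List[int]:
    """Draw cluster ("table") assignments for n points from a classical CRP.

    Illustrative sketch of the standard, exchangeable CRP only; it does NOT
    implement the Dynamical CRP described in the abstract.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    rng = random.Random(seed)
    counts: List[int] = []       # counts[k] = number of points in cluster k
    assignments: List[int] = []  # assignments[t] = cluster index of point t
    for _ in range(n):
        # Existing cluster k is chosen with weight counts[k];
        # a brand-new cluster is chosen with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)     # open a new cluster
        else:
            counts[k] += 1       # join an existing cluster
        assignments.append(k)
    return assignments


if __name__ == "__main__":
    print(sample_crp(20, alpha=1.0, seed=0))
```

In this standard process the probability of joining a cluster depends only on its current size, so the order of the data does not matter. A non-exchangeable, non-stationary prior such as the one the abstract describes would presumably let these weights depend on time.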