Probabilistic topic models, such as probabilistic latent semantic indexing (pLSI) and latent Dirichlet allocation (LDA), are popular unsupervised learning methods. At present, their training is implemented on general-purpose computers (GPCs), which are flexible to program but energy-consuming. Towards low-energy implementations, this paper investigates training them on an emerging hardware technology called neuromorphic multi-chip systems (NMSs). NMSs are highly efficient for a family of algorithms known as spiking neural networks (SNNs). We present three SNNs for training topic models. The first SNN is a batch algorithm that combines the conventional collapsed Gibbs sampling (CGS) algorithm with an inference SNN to train LDA. The other two SNNs are online algorithms targeting environments that are limited in both energy and storage. The two online algorithms are equivalent to training LDA by maximum a posteriori (MAP) estimation and by maximizing the semi-collapsed likelihood, respectively. They use novel, tailored ordinary differential equations for stochastic optimization. We simulate the new algorithms and show that they are comparable to the GPC algorithms while being suitable for NMS implementation. We also propose an extension to train pLSI and a method to prune the network so that it obeys the limited fan-in of some NMSs.
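For reference, the following is a minimal, illustrative Python sketch of the conventional collapsed Gibbs sampling baseline for LDA that the first SNN builds on; the function name, hyperparameter values, and data layout are our own assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def collapsed_gibbs_lda(docs, V, K, alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Illustrative collapsed Gibbs sampler for LDA (not the paper's code).

    docs: list of documents, each a list of word ids in [0, V).
    V: vocabulary size; K: number of topics.
    Returns topic assignments and the count matrices.
    """
    rng = np.random.default_rng(seed)
    D = len(docs)
    n_dk = np.zeros((D, K))  # document-topic counts
    n_kw = np.zeros((K, V))  # topic-word counts
    n_k = np.zeros(K)        # total words per topic
    # random initial topic assignment for every token
    z = [rng.integers(K, size=len(doc)) for doc in docs]
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]
            n_dk[d, k] += 1; n_kw[k, w] += 1; n_k[k] += 1

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]
                # remove the token's current assignment from the counts
                n_dk[d, k] -= 1; n_kw[k, w] -= 1; n_k[k] -= 1
                # collapsed conditional p(z = k | rest), up to a constant
                p = (n_dk[d] + alpha) * (n_kw[:, w] + beta) / (n_k + V * beta)
                k = rng.choice(K, p=p / p.sum())
                z[d][i] = k
                n_dk[d, k] += 1; n_kw[k, w] += 1; n_k[k] += 1
    return z, n_dk, n_kw
```

Topic-word distributions can then be estimated by normalizing `n_kw + beta` along its rows; the batch SNN described in the paper replaces the inference step of such a sampler with a spiking network suitable for NMS hardware.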