A probabilistic latent variable model for detecting structure in binary data

We introduce a novel probabilistic binary latent variable model for detecting noisy or approximate repeats of patterns in sparse binary data. The model is based on the "Noisy-OR model" (Heckerman, 1990), which has previously been used for disease and topic modelling. We demonstrate the model's capability by extracting structure from recordings of retinal neurons, but it can be applied widely to discover and model latent structure in noisy binary data. In the context of spiking neural data, the task is to "explain" the spikes of individual neurons in terms of groups of neurons, "Cell Assemblies" (CAs), that often fire together because of mutual interactions or other causes. The model infers sparse activity in a set of binary latent variables, each describing the activity of one cell assembly. When the latent variable of a cell assembly is active, it lowers the probability that each neuron belonging to that assembly remains inactive. The conditional probability kernels of the latent components are learned from the data in an expectation-maximization scheme that involves inferring the latent states and adjusting the model parameters. We thoroughly validate the model on synthetic spike trains constructed to statistically resemble recorded retinal responses to white-noise and natural-movie stimuli. We also apply the model to spiking responses recorded from retinal ganglion cells (RGCs) during stimulation with a movie, and we discuss the structure it finds.
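As a rough illustration of the noisy-OR idea described above, the following minimal sketch computes the probability that each neuron stays silent given the binary states of the cell assemblies, and samples one binary spike word from it. The parameterization is an assumption made for illustration (one common form of the noisy-OR likelihood), not necessarily the exact one used in the paper; the names `p_silent`, `sample_spikes`, `P_CA`, and `P_LEAK`, and all numeric values, are hypothetical.

```python
import random


def p_silent(z, p_ca, p_leak):
    """Return P(x_i = 0 | z) for every neuron i under a noisy-OR likelihood.

    z      : list of 0/1 latent states, one per cell assembly
    p_ca   : p_ca[a][i] is the probability that neuron i stays silent
             even though assembly a is active (1.0 means "not a member")
    p_leak : p_leak[i] is the probability that neuron i stays silent
             when no assembly is active (assumed baseline term)
    """
    silent = list(p_leak)
    for a, active in enumerate(z):
        if active:
            # Each active assembly multiplies in its own "failure" probability,
            # so every additional active cause can only lower P(silent).
            for i in range(len(silent)):
                silent[i] *= p_ca[a][i]
    return silent


def sample_spikes(z, p_ca, p_leak, rng):
    """Draw one binary spike word x given latent states z."""
    return [int(rng.random() >= q) for q in p_silent(z, p_ca, p_leak)]


# Hypothetical toy setup: 4 neurons, 2 assemblies.
# Assembly 0 contains neurons 0 and 1; assembly 1 contains neurons 2 and 3.
P_LEAK = [0.9, 0.9, 0.9, 0.9]
P_CA = [
    [0.2, 0.2, 1.0, 1.0],
    [1.0, 1.0, 0.3, 0.3],
]

if __name__ == "__main__":
    rng = random.Random(0)
    print("P(silent | assembly 0 active):", p_silent([1, 0], P_CA, P_LEAK))
    print("sampled spike word:", sample_spikes([1, 0], P_CA, P_LEAK, rng))
```

With assembly 0 active, the silence probability of its member neurons drops from 0.9 to about 0.18, while non-members keep their baseline. An expectation-maximization procedure of the kind the abstract describes would alternate between inferring `z` for each observed spike word and updating `P_CA`; that step is not shown here.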
