Many modern time-series datasets contain large numbers of output response variables sampled for prolonged periods of time. For example, in neuroscience, the activities of 100s-1000's of neurons are recorded during behaviors and in response to sensory stimuli. Multi-output Gaussian process models leverage the nonparametric nature of Gaussian processes to capture structure across multiple outputs. However, this class of models typically assumes that the correlations between the output response variables are invariant in the input space. Stochastic linear mixing models (SLMM) assume the mixture coefficients depend on input, making them more flexible and effective to capture complex output dependence. However, currently, the inference for SLMMs is intractable for large datasets, making them inapplicable to several modern time-series problems. In this paper, we propose a new regression framework, the orthogonal stochastic linear mixing model (OSLMM) that introduces an orthogonal constraint amongst the mixing coefficients. This constraint reduces the computational burden of inference while retaining the capability to handle complex output dependence. We provide Markov chain Monte Carlo inference procedures for both SLMM and OSLMM and demonstrate superior model scalability and reduced prediction error of OSLMM compared with state-of-the-art methods on several real-world applications. In neurophysiology recordings, we use the inferred latent functions for compact visualization of population responses to auditory stimuli, and demonstrate superior results compared to a competing method (GPFA). Together, these results demonstrate that OSLMM will be useful for the analysis of diverse, large-scale time-series datasets.
翻译:许多现代时间序列数据集包含大量长期抽样的输出响应变量。 例如,在神经科学中,100-1000个神经元的活动记录在行为过程中和对感官刺激的反应中。 多输出高斯进程模型利用高斯进程的非参数性质来捕捉多种产出的结构。 但是,这一类模型通常假设产出响应变量之间的相互关系在输入空间中是变化不定的。 随机线性线性混合模型(SLMM)假设混合物系数依赖于投入,使其更灵活和更有效地捕捉复杂的神经依赖性。然而,目前,对大型数据集而言,SLMMM的推论是棘手的,使之不适用于若干现代的时间序列问题。在本文中,我们提出了一个新的回归框架,即振动的振动性线性线性线性混合模型(OSLMMM),在混合系数中引入一个或可测的制约。这种制约将减少计算性系数的负担,同时保留处理复杂产出依赖性稳定的OLMMLM的计算能力。 我们用SLMLM的高级数据序列和直径性模型中, 展示了多个SLMLMLM的精确性数据的精确性, 。