Seeking high-quality representations with latent variable models (LVMs) to reveal the intrinsic correlation between neural activity and behavior or sensory stimuli has attracted much interest. Most work has focused on analyzing motor neural activity that controls clear behavioral traces and has modeled neural temporal relationships in a way that does not conform to natural reality. For studies of visual brain regions, naturalistic visual stimuli are high-dimensional and time-dependent, making neural activity exhibit intricate dynamics. To cope with such conditions, we propose Time-Dependent Split VAE (TiDeSPL-VAE), a sequential LVM that decomposes visual neural activity into two latent representations while considering time dependence. We specify content latent representations corresponding to the component of neural activity driven by the current visual stimulus, and style latent representations corresponding to the neural dynamics influenced by the organism's internal state. To progressively generate the two latent representations over time, we introduce state factors to construct conditional distributions with time dependence and apply self-supervised contrastive learning to shape them. By this means, TiDeSPL-VAE can effectively analyze complex visual neural activity and model temporal relationships in a natural way. We compare our model with alternative approaches on synthetic data and neural data from the mouse visual cortex. The results show that our model not only yields the best decoding performance on naturalistic scenes/movies but also extracts explicit neural dynamics, demonstrating that it builds latent representations more relevant to visual stimuli.
翻译:暂无翻译