Real-world behavior is often shaped by complex interactions between multiple agents. To study multi-agent behavior at scale, advances in unsupervised and self-supervised learning have enabled a variety of behavioral representations to be learned from trajectory data. To date, there is no unified set of benchmarks for comparing methods quantitatively and systematically across a broad set of behavior analysis settings. We aim to address this by introducing a large-scale, multi-agent trajectory dataset from real-world behavioral neuroscience experiments that covers a range of behavior analysis tasks. Our dataset consists of trajectory data from common model organisms, with 9.6 million frames of mouse data and 4.4 million frames of fly data, in a variety of experimental settings, such as different strains, lengths of interaction, and optogenetic stimulation. A subset of the frames also includes expert-annotated behavior labels. Improvements on our dataset correspond to behavioral representations that work across multiple organisms and are able to capture differences for common behavior analysis tasks.