Recent technological developments have spurred great advances in the computerized tracking of joints and other landmarks in moving animals, including humans. Such tracking promises important advances in biology and biomedicine. Modern tracking models depend critically on datasets of primary landmarks that are laboriously annotated by non-expert humans. However, such annotation approaches can be costly and impractical for secondary landmarks, that is, ones that reflect the fine-grained geometry of animals and that are often specific to customized behavioral tasks. Because of visual and geometric ambiguity, non-experts are often not qualified to annotate secondary landmarks, which can require anatomical and zoological knowledge. These barriers significantly impede downstream behavioral studies because the learned tracking models exhibit limited generalizability. We hypothesize that there exists a shared representation between the primary and secondary landmarks because the range of motion of the secondary landmarks can be approximately spanned by that of the primary landmarks. We present a method to learn this spatial relationship between the primary and secondary landmarks in three-dimensional space, which can, in turn, self-supervise the secondary landmark detector. This 3D representation learning is generic and can therefore be applied to various multiview settings across diverse organisms, including macaques, flies, and humans.
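The hypothesis that the motion of a secondary landmark is approximately spanned by that of the primary landmarks can be illustrated with a deliberately simplified sketch. It assumes, purely for illustration, that a secondary landmark's 3D position is a fixed linear combination of the primary landmarks' 3D positions, and fits the combination weights by least squares. The linear model and the function names `fit_secondary_from_primary` and `predict_secondary` are hypothetical and are not the method of the paper, which learns the relationship rather than assuming a linear form.

```python
import numpy as np


def fit_secondary_from_primary(primary, secondary):
    """Fit weights w so that secondary[t] ~= sum_k w[k] * primary[t, k].

    primary:   array of shape (T, K, 3), K primary landmarks over T frames
    secondary: array of shape (T, 3), one secondary landmark over T frames
    Returns an array of K weights (illustrative linear assumption).
    """
    T, K, _ = primary.shape
    # One row per (frame, coordinate) pair, one column per primary landmark.
    A = primary.transpose(0, 2, 1).reshape(T * 3, K)
    b = secondary.reshape(T * 3)
    weights, *_ = np.linalg.lstsq(A, b, rcond=None)
    return weights


def predict_secondary(primary, weights):
    """Predict the secondary landmark as a weighted sum of primary landmarks."""
    # Contract over the landmark axis k: (T, K, 3) and (K,) -> (T, 3).
    return np.einsum("tkc,k->tc", primary, weights)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic data only: 200 frames, 4 primary landmarks in 3D.
    primary = rng.normal(size=(200, 4, 3))
    true_weights = np.array([0.5, 0.3, 0.2, 0.0])
    secondary = predict_secondary(primary, true_weights)
    secondary += 0.01 * rng.normal(size=secondary.shape)  # small noise

    weights = fit_secondary_from_primary(primary, secondary)
    residual = np.abs(predict_secondary(primary, weights) - secondary).mean()
    print("fitted weights:", weights)
    print("mean absolute residual:", residual)
```

In this toy setting, a predictor fitted from a handful of frames with known secondary positions could supply approximate targets for the remaining frames, which is the sense in which the learned relationship could provide supervision for the secondary landmark detector.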