Understanding animal behavior is important for a wide range of applications. However, existing animal behavior datasets are limited in several respects: they cover few animal classes, contain few data samples, support few tasks, and show little variation in environmental conditions and viewpoints. To address these limitations, we create a large and diverse dataset, Animal Kingdom, that provides multiple annotated tasks to enable a more thorough understanding of natural animal behaviors. The wild animal footage in our dataset was recorded at different times of the day across an extensive range of environments, with variations in backgrounds, viewpoints, illumination and weather conditions. More specifically, our dataset contains 50 hours of annotated videos for the video grounding task, in which relevant animal behavior segments are localized in long videos; 30K video sequences for the fine-grained multi-label action recognition task; and 33K frames for the pose estimation task. Together these cover a diverse range of animals, with 850 species across 6 major animal classes. Such a challenging and comprehensive dataset should help the community develop, adapt, and evaluate various types of advanced methods for animal behavior analysis. Moreover, we propose a Collaborative Action Recognition (CARe) model that learns general and specific features for action recognition with unseen new animals. This method achieves promising performance in our experiments. Our dataset can be found at https://sutdcv.github.io/Animal-Kingdom.