Camera traps are a proven tool in biology, particularly in biodiversity research. However, camera traps that include depth estimation are not widely deployed, even though depth provides valuable context about the scene and facilitates the automation of previously laborious manual ecological methods. In this study, we propose an automated camera trap-based approach to detect and identify animals using depth estimation. To detect and identify individual animals, we propose a novel method, D-Mask R-CNN, for instance segmentation, a deep learning-based technique that detects and delineates each distinct object of interest appearing in an image or a video clip. An experimental evaluation shows the benefit of the additional depth estimation: average precision scores for animal detection improve compared to the standard approach that relies on image information alone. The approach was also evaluated as a proof of concept in a zoo scenario using an RGB-D camera trap.
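To make the evaluation metric concrete, the following is a minimal sketch of how average precision can be computed for instance masks. It is not the authors' evaluation code: the abstract does not state the exact protocol, so the single IoU threshold of 0.5, the non-interpolated form of average precision, and the helper names `mask_iou` and `average_precision` are all assumptions made for illustration.

```python
def mask_iou(a, b):
    """Intersection over union of two binary masks (equally sized nested lists of 0/1)."""
    inter = union = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            inter += 1 if (pa and pb) else 0
            union += 1 if (pa or pb) else 0
    return inter / union if union else 0.0


def average_precision(detections, ground_truth, iou_threshold=0.5):
    """Non-interpolated AP for one image and one class (illustrative assumption).

    detections: list of (confidence_score, mask) pairs.
    ground_truth: list of masks, one per annotated animal.
    """
    # Visit detections from most to least confident.
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    matched = set()          # indices of ground-truth masks already claimed
    true_positives = 0
    precision_sum = 0.0
    for rank, (_, mask) in enumerate(ranked, start=1):
        # Find the best-overlapping ground-truth mask that is still unmatched.
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(ground_truth):
            if idx in matched:
                continue
            iou = mask_iou(mask, gt)
            if iou > best_iou:
                best_iou, best_idx = iou, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            true_positives += 1
            # Precision at this rank, accumulated only at true positives.
            precision_sum += true_positives / rank
    return precision_sum / len(ground_truth) if ground_truth else 0.0


# Toy 2x2 example: two annotated animals, three detections (one is a false positive).
gt_masks = [[[1, 1], [0, 0]], [[0, 0], [1, 0]]]
dets = [
    (0.9, [[1, 1], [0, 0]]),  # matches the first animal
    (0.8, [[0, 0], [0, 1]]),  # overlaps nothing
    (0.7, [[0, 0], [1, 0]]),  # matches the second animal
]
ap = average_precision(dets, gt_masks)
```

Under this definition, a model that ranks correct masks above spurious ones scores higher, which is the sense in which an added depth channel could raise average precision over an image-only baseline.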