Automatic identification of plant specimens from amateur photographs could improve species range maps, thus supporting ecosystem research as well as conservation efforts. However, classifying plant specimens based on image data alone is challenging: some species exhibit large variations in visual appearance, while different species are often visually similar. Additionally, species observations follow a highly imbalanced, long-tailed distribution due to differences in abundance as well as observer biases. On the other hand, most species observations are accompanied by side information about the spatial, temporal and ecological context. Moreover, biological species are not an unordered list of classes but are embedded in a hierarchical taxonomic structure. We propose a multimodal deep learning model that takes these additional cues into account in a unified framework. Our Digital Taxonomist identifies plant species in photographs better than a classifier trained on the image content alone; the gain in accuracy is more than 6 percentage points.
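To illustrate why contextual cues and the taxonomy can help, the following is a minimal sketch and not the model proposed here. It assumes a simple late-fusion scheme in which image-based class probabilities are multiplied by a context prior (for example, how likely each species is at the observed place and time) and then summed up to the genus level. All function names, species, and numbers are hypothetical toy values chosen for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(image_logits, context_prior):
    """Multiply image-based probabilities by a context prior and renormalise.

    This is an assumed late-fusion rule, used only for illustration.
    """
    probs = softmax(image_logits)
    joint = [p * q for p, q in zip(probs, context_prior)]
    total = sum(joint)
    return [j / total for j in joint]


def roll_up(species_probs, species_to_genus):
    """Sum species probabilities within each genus of the taxonomy."""
    genus_probs = {}
    for p, genus in zip(species_probs, species_to_genus):
        genus_probs[genus] = genus_probs.get(genus, 0.0) + p
    return genus_probs


# Hypothetical toy example: two visually similar oaks and one maple.
image_logits = [2.0, 1.9, 0.1]          # image alone barely separates the oaks
context_prior = [0.05, 0.80, 0.15]      # assumed prior from location and date
species_to_genus = ["Quercus", "Quercus", "Acer"]

image_only = softmax(image_logits)
fused = fuse(image_logits, context_prior)
genus = roll_up(fused, species_to_genus)
```

In this toy case the image scores alone slightly favour the first species, while the fused scores favour the second, and the genus-level probability stays high either way. A learned multimodal model would replace the fixed prior and the multiplication rule with trained components.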