Monitoring biodiversity is essential for managing and protecting natural resources. Collecting images of organisms over large temporal or spatial scales is a promising way to monitor the biodiversity of natural ecosystems, providing large amounts of data with minimal interference with the environment. Deep learning models are currently used to automate the classification of organisms into taxonomic units. However, imprecision in these classifiers introduces measurement noise that is difficult to control and can significantly hinder the analysis and interpretation of data. We overcome this limitation through ensembles of Data-efficient image Transformers (DeiTs), which are not only easy to train and implement, but also significantly outperform the previous state of the art (SOTA). We validate our results on ten ecological imaging datasets of diverse origin, ranging from plankton to birds. On all the datasets, we achieve a new SOTA, reducing the error relative to the previous SOTA by between 29.35% and 100.00%, and often coming very close to perfect classification. Ensembles of DeiTs perform better not because of superior single-model performance, but because the predictions of independent models overlap less and have lower top-1 probabilities. This increases the benefit of ensembling, especially when geometric averages are used to combine individual learners. Although we test our approach only on biodiversity image datasets, it is generic and can be applied to any kind of image.
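To make the ensembling step concrete, the following is a minimal sketch of how per-model class probabilities can be combined with an arithmetic or a geometric average, and of how a relative error reduction like the one quoted above can be computed. It assumes each model outputs softmax probabilities stored in a NumPy array of shape `(n_models, n_samples, n_classes)`. The function names, the `eps` clipping constant, and the toy numbers are illustrative assumptions, not taken from the authors' code.

```python
import numpy as np


def ensemble_arithmetic(probs):
    """Arithmetic mean of class probabilities across models.

    probs: array of shape (n_models, n_samples, n_classes).
    """
    return np.asarray(probs).mean(axis=0)


def ensemble_geometric(probs, eps=1e-12):
    """Geometric mean of class probabilities across models, renormalized.

    The mean is taken in log space for numerical stability; eps (an
    assumed constant) guards against log(0).
    """
    log_probs = np.log(np.clip(np.asarray(probs), eps, 1.0))
    geo = np.exp(log_probs.mean(axis=0))
    # Renormalize so that each sample's class scores sum to 1.
    return geo / geo.sum(axis=-1, keepdims=True)


def relative_error_reduction(err_prev, err_new):
    """Percentage reduction of the error relative to a previous error."""
    return 100.0 * (err_prev - err_new) / err_prev


# Toy example (made-up numbers): three models, one image, three classes.
probs = np.array([
    [[0.6, 0.3, 0.1]],
    [[0.2, 0.5, 0.3]],
    [[0.1, 0.6, 0.3]],
])

arith = ensemble_arithmetic(probs)
geo = ensemble_geometric(probs)

pred_arith = arith.argmax(axis=-1)  # predicted class per image
pred_geo = geo.argmax(axis=-1)
```

Because the geometric mean multiplies probabilities, a class that any single model rates as very unlikely is penalized more strongly than under the arithmetic mean. This is one plausible reading of why ensembles whose members are less overconfident may benefit more from it.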