Monitoring plankton populations in situ is fundamental to preserving the aquatic ecosystem. Plankton microorganisms are susceptible to minor environmental perturbations, which can be reflected in morphological and dynamical modifications. The availability of advanced automatic or semi-automatic acquisition systems now allows the production of an increasingly large amount of plankton image data. The adoption of machine learning algorithms to classify such data may be hindered by the significant cost of manual annotation, due to both the huge quantity of acquired data and the large number of plankton species. To address these challenges, we propose an efficient unsupervised learning pipeline that provides accurate classification of plankton microorganisms. We build a set of image descriptors through a two-step procedure. First, a Variational Autoencoder (VAE) is trained on features extracted by a pre-trained neural network. We then use the learnt latent space as the image descriptor for clustering. We compare our method with state-of-the-art unsupervised approaches, in which a set of pre-defined hand-crafted features is used to cluster plankton images. The proposed pipeline outperforms the benchmark algorithms on all the plankton datasets included in our analysis, providing better image embedding properties.
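As a rough illustration of the two-step procedure, the sketch below shows the two ingredients that distinguish a VAE from a plain autoencoder (the reparameterised latent sample and the KL regulariser) and then clusters a set of latent descriptors. Several choices here are assumptions and do not come from the text above: the clustering algorithm is taken to be k-means, the encoder outputs are replaced by synthetic two-dimensional points, and all function names (`kl_to_standard_normal`, `sample_latent`, `kmeans`, `make_toy_descriptors`) are hypothetical. The pre-trained feature extractor and the VAE training loop are omitted.

```python
import math
import random


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ): the usual VAE regulariser."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mu, log_var))


def sample_latent(mu, log_var, rng):
    """Reparameterisation trick: z = mu + sigma * eps, with eps ~ N(0, I)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means (an assumed choice); returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each descriptor goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centroids.append(centroids[j])  # leave an empty cluster in place
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


def make_toy_descriptors(seed=0):
    """Synthetic stand-in for VAE latent means: two well-separated groups."""
    rng = random.Random(seed)
    group_a = [[rng.gauss(-3.0, 0.3), rng.gauss(0.0, 0.3)] for _ in range(20)]
    group_b = [[rng.gauss(3.0, 0.3), rng.gauss(0.0, 0.3)] for _ in range(20)]
    return group_a + group_b


if __name__ == "__main__":
    descriptors = make_toy_descriptors()
    labels, centroids = kmeans(descriptors, k=2)
    print("cluster sizes:", [labels.count(j) for j in range(2)])
    print("KL at the prior:", kl_to_standard_normal([0.0, 0.0], [0.0, 0.0]))
```

In an actual pipeline the toy descriptors would be replaced by the latent means that the trained encoder produces for each image, and the number of clusters would be chosen to suit the dataset.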