Automated analysis of electron microscopy datasets poses multiple challenges, such as the limited size of training datasets and shifts in data distribution induced by variation in sample quality and experimental conditions. It is crucial for the trained model to continue to provide acceptable segmentation/classification performance on new data, and to quantify the uncertainty associated with its predictions. Across the broad applications of machine learning, various approaches have been adopted to quantify uncertainty, such as Bayesian modeling, Monte Carlo dropout, and ensembles. With the aim of addressing the challenges specific to the data domain of electron microscopy, two different types of ensembles of pre-trained neural networks were implemented in this work. The ensembles performed semantic segmentation of crystalline ice within a two-phase mixture, thereby tracking its phase transformation to water. The first ensemble (EA) is composed of U-Net style networks having different underlying architectures, whereas the second series of ensembles (ER-i) is composed of randomly initialized U-Net style networks, wherein each base learner has the same underlying architecture 'i'. The encoders of the base learners were pre-trained on the ImageNet dataset. The performance of EA and ER was evaluated on three metrics: accuracy, calibration, and uncertainty. EA exhibits higher classification accuracy and is better calibrated than ER. While the uncertainty quantification of the two types of ensembles is comparable, the uncertainty scores exhibited by ER were found to depend on the specific architecture of its base member ('i') and were not consistently better than those of EA. Thus, the challenges posed by the analysis of electron microscopy datasets appear to be better addressed by an ensemble design like EA than by one like ER.
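As an illustration of how an ensemble's per-pixel predictions could be scored on calibration and uncertainty, the sketch below averages the softmax outputs of the base learners, computes the predictive entropy of the averaged distribution, and estimates an expected calibration error (ECE). The abstract does not state which calibration or uncertainty measures were used, so ECE and predictive entropy are assumptions here, chosen because they are common; the function names, array shapes, and bin count are likewise hypothetical.

```python
import numpy as np


def ensemble_mean(member_probs):
    """Average class probabilities over ensemble members.

    member_probs: array of shape (M, H, W, C), where M is the number of
    base learners and C the number of classes (assumed layout).
    """
    return np.mean(member_probs, axis=0)


def predictive_entropy(mean_probs, eps=1e-12):
    """Per-pixel entropy of the averaged distribution (assumed uncertainty score)."""
    return -np.sum(mean_probs * np.log(mean_probs + eps), axis=-1)


def expected_calibration_error(mean_probs, labels, n_bins=10):
    """ECE with equal-width confidence bins (assumed calibration measure)."""
    confidence = mean_probs.max(axis=-1).ravel()
    predicted = mean_probs.argmax(axis=-1).ravel()
    correct = (predicted == labels.ravel()).astype(float)

    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidence > lo) & (confidence <= hi)
        if in_bin.any():
            # Weight each bin's |accuracy - confidence| gap by its pixel share.
            gap = abs(correct[in_bin].mean() - confidence[in_bin].mean())
            ece += in_bin.mean() * gap
    return ece


# Toy example: two base learners, a 1x2 image, two classes (ice, water).
member_probs = np.array([
    [[[0.9, 0.1], [0.2, 0.8]]],
    [[[0.7, 0.3], [0.4, 0.6]]],
])
labels = np.array([[0, 1]])

mean_probs = ensemble_mean(member_probs)
entropy_map = predictive_entropy(mean_probs)
ece = expected_calibration_error(mean_probs, labels)
```

In this sketch the same three functions would apply unchanged to either ensemble design, since only the source of `member_probs` differs between an EA-style and an ER-style ensemble.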