Learning compact and interpretable representations of data is a critical challenge in scientific image analysis. Here, we introduce Affinity-VAE, a generative model that lets us impose our scientific intuition about the similarity of instances in the dataset on the learned representation during training. We demonstrate the utility of the approach in cryo-electron tomography (cryo-ET), where a significant current challenge is to identify similar molecules within a noisy, low-contrast tomographic image volume. This task is distinct from classification in that, at inference time, it is unknown whether an instance belongs to a class seen in the training set. We trained Affinity-VAE using prior knowledge of protein structure to inform the latent space. Our model creates rotationally invariant, morphologically homogeneous clusters in the latent representation, with improved cluster separation compared to other approaches. It achieves competitive performance on protein classification, with the added benefits of disentangling object pose and structural similarity and yielding an interpretable latent representation. In the context of cryo-ET data, Affinity-VAE captures the 3D orientation of identified proteins, which can be used as a prior for subsequent scientific experiments. Extracting physical principles from a trained network is of significant importance in scientific imaging, where a ground-truth training set is not always available.