We propose a novel formulation of the triplet objective function that improves metric learning without additional sample mining or overhead costs. Our approach aims to explicitly regularize the distance between the positive and negative samples in a triplet with respect to the anchor-negative distance. As an initial validation, we show that our method (called No Pairs Left Behind [NPLB]) improves upon the traditional and current state-of-the-art triplet objective formulations on standard benchmark datasets. To show the effectiveness and potential of NPLB on real-world complex data, we evaluate our approach on a large-scale healthcare dataset (UK Biobank), demonstrating that the embeddings learned by our model significantly outperform all other current representations on the downstream tasks tested. Additionally, we provide a new model-agnostic single-time health risk definition that, when used in tandem with the learned representations, achieves the most accurate prediction of subjects' future health complications. Our results indicate that NPLB is a simple yet effective framework for improving existing deep metric learning models, showcasing the potential implications of metric learning in more complex applications, especially in the biological and healthcare domains.
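As a rough illustration of the kind of objective described above, the sketch below augments a standard triplet hinge loss with a hypothetical penalty tying the positive-negative distance to the anchor-negative distance. The function name, the squared-difference form of the regularizer, and the `reg_weight` parameter are assumptions for illustration; this is not the exact NPLB formulation, which is not reproduced in the abstract.

```python
import numpy as np

def triplet_loss_with_pn_regularizer(anchor, positive, negative,
                                     margin=1.0, reg_weight=0.1):
    """Classic triplet loss plus an illustrative regularizer that
    encourages the positive-negative distance to track the
    anchor-negative distance. Hypothetical sketch only; not the
    exact NPLB objective."""
    d_ap = np.linalg.norm(anchor - positive)    # anchor-positive distance
    d_an = np.linalg.norm(anchor - negative)    # anchor-negative distance
    d_pn = np.linalg.norm(positive - negative)  # positive-negative distance
    base = max(d_ap - d_an + margin, 0.0)       # standard triplet hinge term
    reg = reg_weight * (d_pn - d_an) ** 2       # assumed penalty on the d_pn/d_an gap
    return base + reg
```

Note that the added term requires no extra sample mining: it reuses the same anchor, positive, and negative already present in each triplet, which matches the abstract's claim of no additional mining or overhead.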