Averaging predictions over a set of models -- an ensemble -- is widely used to improve the predictive performance and uncertainty estimation of deep learning models. At the same time, many machine learning systems, such as search, matching, and recommendation systems, rely heavily on embeddings. Unfortunately, due to the misalignment of features across independently trained models, embeddings cannot be improved with a naive deep-ensemble-like approach. In this work, we study the ensembling of representations and propose mean embeddings with test-time augmentation (MeTTA), a simple yet well-performing recipe for ensembling representations. Empirically, we demonstrate that MeTTA significantly boosts the quality of linear evaluation on ImageNet for both supervised and self-supervised models. Even more exciting, we draw connections between MeTTA, image retrieval, and transformation-invariant models. We believe that extending the success of ensembles to inferring higher-quality representations is an important step that will open many new applications of ensembling.
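The core recipe can be sketched in a few lines: embed several augmented views of the same input and average the resulting embeddings. A minimal sketch, assuming a generic `encoder` function and an `augment` transformation (both hypothetical stand-ins, not the paper's exact pipeline):

```python
import numpy as np

def metta_embedding(encoder, image, augment, n_aug=8):
    """Mean embedding with test-time augmentation (MeTTA):
    average the encoder's embeddings over several augmented views."""
    views = [augment(image) for _ in range(n_aug)]
    embeddings = np.stack([encoder(v) for v in views])  # shape: (n_aug, d)
    return embeddings.mean(axis=0)

# Toy demo with a hypothetical linear "encoder" and additive-noise augmentation.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))                              # 3-dim input -> 4-dim embedding
encoder = lambda x: W @ x
augment = lambda x: x + rng.normal(scale=0.1, size=x.shape)
x = np.ones(3)
z = metta_embedding(encoder, x, augment, n_aug=16)
print(z.shape)  # (4,)
```

Because all views are embedded by the same model, the averaged vectors live in a shared feature space, which is what a naive deep ensemble of independently trained encoders lacks.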