Learning from a limited number of samples is challenging because the learned model can easily overfit to the biased distribution formed by only a few training examples. In this paper, we calibrate the distribution of these few-sample classes by transferring statistics from classes with sufficient examples, so that an adequate number of examples can be sampled from the calibrated distribution to expand the inputs to the classifier. We assume every dimension of the feature representation follows a Gaussian distribution, so the mean and variance of a few-shot class's distribution can be borrowed from those of similar classes whose statistics are better estimated thanks to their adequate number of samples. Our method can be built on top of off-the-shelf pretrained feature extractors and classification models without introducing extra parameters. We show that a simple logistic regression classifier trained on features sampled from our calibrated distribution outperforms state-of-the-art accuracy on two datasets (~5% improvement on miniImageNet over the next best method). Visualization of the generated features demonstrates that our calibrated distribution is an accurate estimate.
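The calibration step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `calibrate_and_sample`, the choice of `k` nearest base classes, and the small variance offset `alpha` are assumptions for the sketch. Per the per-dimension Gaussian assumption, only diagonal (per-dimension) variances are used; the calibrated mean and variance average the statistics of the nearest base classes with the support feature, and new features are drawn from the resulting Gaussian.

```python
import numpy as np

def calibrate_and_sample(support_feat, base_means, base_vars,
                         k=2, n_samples=100, alpha=0.2):
    """Calibrate a few-shot class's distribution from similar base classes.

    support_feat: (d,) feature of one support example of the novel class.
    base_means:   (C, d) per-class feature means of the base classes.
    base_vars:    (C, d) per-class per-dimension feature variances.
    Returns (n_samples, d) features drawn from the calibrated Gaussian.
    """
    # Find the k base classes whose means are closest to the support feature.
    dists = np.linalg.norm(base_means - support_feat, axis=1)
    nearest = np.argsort(dists)[:k]

    # Calibrated statistics: average the selected base-class means together
    # with the support feature; average the variances, plus a small offset
    # (alpha is a hypothetical dispersion constant for this sketch).
    mean = (base_means[nearest].sum(axis=0) + support_feat) / (k + 1)
    var = base_vars[nearest].mean(axis=0) + alpha

    # Sample an adequate number of features from the calibrated distribution.
    return np.random.normal(mean, np.sqrt(var), size=(n_samples, support_feat.shape[0]))
```

The sampled features, together with the original support features, can then be fed to any simple classifier (e.g. logistic regression) as an expanded training set.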