The natural world is long-tailed: rare classes are observed orders of magnitude less frequently than common ones, leading to highly imbalanced data in which rare classes may have only a handful of examples. Learning from few examples is a known challenge for deep-learning-based classification algorithms and is the focus of the field of low-shot learning. One potential approach to increase the training data for these rare classes is to augment the limited real data with synthetic samples. This has been shown to help, but the domain shift between real and synthetic imagery limits such approaches' efficacy when models are tested on real data. We explore the use of image-to-image translation methods to close the domain gap between synthetic and real imagery for animal species classification in data collected from camera traps: motion-activated static cameras used to monitor wildlife. We use low-level feature alignment between source and target domains to make synthetic data for a rare species, generated using a graphics engine, more "realistic". Compared against a system augmented with unaligned synthetic data, our experiments show a considerable decrease in classification error rates on a rare species.
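As a minimal sketch of what "low-level feature alignment" can mean in this setting, the snippet below matches the per-channel colour statistics (mean and standard deviation) of a rendered synthetic image to those of a real camera-trap image, in the spirit of Reinhard-style colour transfer. This is only an illustrative baseline under assumed file names, not the translation method used in the paper.

```python
# Minimal sketch: align per-channel colour statistics of a synthetic image
# to a real camera-trap image. Illustrative only; not the paper's method.
import numpy as np
from PIL import Image


def align_channel_stats(synthetic: np.ndarray, real: np.ndarray) -> np.ndarray:
    """Shift and scale each colour channel of `synthetic` so its mean and
    standard deviation match those of `real`."""
    out = synthetic.astype(np.float64)
    for c in range(out.shape[2]):
        s_mean, s_std = out[..., c].mean(), out[..., c].std() + 1e-8
        r_mean, r_std = real[..., c].mean(), real[..., c].std()
        out[..., c] = (out[..., c] - s_mean) / s_std * r_std + r_mean
    return np.clip(out, 0, 255).astype(np.uint8)


if __name__ == "__main__":
    # Hypothetical inputs: a graphics-engine rendering of the rare species
    # and a real camera-trap frame from the target domain.
    synthetic = np.asarray(Image.open("synthetic_animal.png").convert("RGB"))
    real = np.asarray(Image.open("real_camera_trap.jpg").convert("RGB"))
    aligned = align_channel_stats(synthetic, real)
    Image.fromarray(aligned).save("synthetic_animal_aligned.png")
```

In practice, learned image-to-image translation models capture richer low-level appearance differences (sensor noise, lighting, compression) than simple statistics matching, but the goal is the same: make the synthetic training images look more like the real test distribution before augmenting the classifier's training set.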