Deep learning algorithms face great challenges with long-tailed data distributions, which are, however, quite common in real-world scenarios. Previous methods tackle the problem from either the input space (re-sampling classes with different frequencies) or the loss space (re-weighting classes with different weights), and thus suffer from heavy over-fitting to tail classes or difficult optimization during training. To alleviate these issues, we propose a more fundamental perspective for long-tailed recognition, {\it i.e.}, the parameter space, and aim to preserve specific capacity for classes with low frequencies. From this perspective, the trivial solution -- utilizing separate branches for head, medium, and tail classes respectively, and then summing their outputs as the final result -- is not feasible. Instead, we design an effective residual fusion mechanism: with one main branch optimized to recognize images from all classes, two residual branches are gradually fused and optimized to enhance the recognition of images from medium+tail classes and tail classes respectively. The branches are then aggregated into the final result via additive shortcuts. We evaluate our method on several benchmarks, {\it i.e.}, long-tailed versions of CIFAR-10, CIFAR-100, Places, ImageNet, and iNaturalist 2018. Experimental results show that our method achieves a new state-of-the-art for long-tailed recognition. Code will be available at \url{https://github.com/FPNAS/ResLT}.
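The additive aggregation of branch logits described above can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the feature dimension, class count, branch weights, and variable names (e.g. `W_main`, `W_med_tail`, `W_tail`) are all hypothetical, and the gradual fusion training schedule is omitted -- only the inference-time additive shortcut is shown.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: backbone feature dimension D, number of classes C.
D, C = 8, 5

# A shared backbone feature for one image (assumed already extracted).
feat = rng.normal(size=(D,))

# Three classifier branches over the same feature: the main branch is
# optimized on all classes; the two residual branches are optimized to
# enhance medium+tail classes and tail classes respectively.
W_main = rng.normal(size=(C, D))
W_med_tail = rng.normal(size=(C, D))
W_tail = rng.normal(size=(C, D))

# Residual fusion at inference: additive shortcuts aggregate the
# per-branch logits into the final prediction.
logits = W_main @ feat + W_med_tail @ feat + W_tail @ feat
prediction = int(np.argmax(logits))

assert logits.shape == (C,)
```

Because the shortcuts are purely additive, the residual branches only need to model the correction for under-represented classes on top of the main branch's output, rather than re-learning the full decision function.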