Deep neural networks have been the driving force behind the success of classification tasks such as object and audio recognition. Impressive results and generalization have been achieved by a variety of recently proposed architectures, the majority of which appear mutually disconnected. In this work, we cast the study of deep classifiers within a unifying framework. In particular, we express state-of-the-art architectures (e.g., residual and non-local networks) as polynomials of different degrees of the input. Our framework provides insights into the inductive biases of each model and enables natural extensions that build upon their polynomial nature. The efficacy of the proposed models is evaluated on standard image and audio classification benchmarks. The expressivity of the proposed models is highlighted both through increased model performance and through model compression. Lastly, the extensions enabled by this taxonomy demonstrate benefits in the presence of limited data and long-tailed data distributions. We expect this taxonomy to provide links between existing domain-specific architectures.
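To make the polynomial view concrete, below is a minimal illustrative sketch (not the paper's exact parametrization): a layer with a multiplicative (Hadamard) interaction between two linear projections of the input, so each output coordinate is a degree-2 polynomial of the input. The weight names `W1`, `W2`, `W3` and the layer form are assumptions for illustration only.

```python
import numpy as np

# Illustrative degree-2 polynomial layer (an assumed toy form, not the
# paper's architecture): y = b + W1 x + (W2 x) * (W3 x), where * is the
# elementwise (Hadamard) product, so every output coordinate is a
# second-degree polynomial in the entries of x.
rng = np.random.default_rng(0)
d_in, d_out = 4, 3
W1 = rng.standard_normal((d_out, d_in))
W2 = rng.standard_normal((d_out, d_in))
W3 = rng.standard_normal((d_out, d_in))
b = rng.standard_normal(d_out)

def degree2(x):
    return b + W1 @ x + (W2 @ x) * (W3 @ x)

x = rng.standard_normal(d_in)
lin = W1 @ x                    # degree-1 term
quad = (W2 @ x) * (W3 @ x)      # degree-2 term

# Scaling the input by t scales the linear term by t and the quadratic
# term by t**2, confirming the polynomial structure of the layer.
assert np.allclose(degree2(x) - b, lin + quad)
assert np.allclose(degree2(2 * x) - b, 2 * lin + 4 * quad)
```

Under this reading, a plain linear classifier is the degree-1 special case, and stacking such multiplicative interactions raises the polynomial degree, which is one way the abstract's "different degree polynomials of the input" framing can be pictured.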