Deep neural networks have been the driving force behind the success of classification tasks, e.g., object and audio recognition. Impressive results and generalization have been achieved by a variety of recently proposed architectures, the majority of which are seemingly disconnected. In this work, we cast the study of deep classifiers into a unifying framework. In particular, we express state-of-the-art architectures (e.g., residual and non-local networks) as polynomials of different degrees of the input. Our framework provides insights into the inductive biases of each model and enables natural extensions that build upon their polynomial nature. The efficacy of the proposed models is evaluated on standard image and audio classification benchmarks. The expressivity of the proposed models is highlighted both in terms of increased model performance and in terms of model compression. Lastly, the extensions enabled by this taxonomy show benefits in the presence of limited data and long-tailed data distributions. We expect this taxonomy to provide links between existing domain-specific architectures. The source code is available at \url{https://github.com/grigorisg9gr/polynomials-for-augmenting-NNs}.
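For intuition, the following is a minimal sketch (not the paper's exact architecture) of what "architectures as polynomials of the input" can look like: a residual-style block augmented with an element-wise (Hadamard) product, so that the block output is a second-degree polynomial of its input. The module name, layer sizes, and parameterization below are illustrative assumptions.

\begin{verbatim}
# Minimal sketch: a residual-style block whose output is a
# degree-2 polynomial of the input x (illustrative, not the
# paper's exact model).
import torch
import torch.nn as nn

class SecondDegreeBlock(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.w1 = nn.Linear(dim, dim)  # first-degree branch
        self.w2 = nn.Linear(dim, dim)  # branch used in the quadratic term

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x + W1 x + (W1 x) * (W2 x): identity, linear, and
        # quadratic (Hadamard-product) terms.
        z = self.w1(x)
        return x + z + z * self.w2(x)

x = torch.randn(8, 16)
y = SecondDegreeBlock(16)(x)  # y is a degree-2 polynomial of x
print(y.shape)                # torch.Size([8, 16])
\end{verbatim}

Since composing two degree-2 polynomials yields a degree-4 polynomial, stacking $N$ such blocks produces an output of degree $2^N$ in the input, which is the sense in which deeper compositions correspond to higher-degree polynomial expansions.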