In this paper, we propose a general framework for image classification that combines the attention mechanism with global context and can be incorporated into various network architectures to improve their performance. To investigate the capability of the global context, we compare four mathematical models and observe that the global context encoded by a category-disentangled conditional generative model offers more guidance, since "knowing what is task-irrelevant also reveals what is relevant". Based on this observation, we define a novel Category Disentangled Global Context (CDGC) and devise a deep network to obtain it. By attending to the CDGC, baseline networks can identify the objects of interest more accurately, thereby improving performance. We apply the framework to a variety of network architectures and compare against the state of the art on four publicly available datasets. Extensive results validate the effectiveness and superiority of our approach. Code will be made public upon paper acceptance.
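To make the "attending to a global context" idea concrete, below is a minimal sketch in PyTorch. It assumes the CDGC is a fixed-length vector produced elsewhere (e.g. by the encoder of the category-disentangled conditional generative model, which is not shown), and uses channel-wise re-weighting of the baseline's feature maps conditioned on that vector as one illustrative form of attention; the paper's exact mechanism may differ. All names (`ContextAttention`, `gate`, the tensor shapes) are hypothetical.

```python
# Minimal sketch, assuming the CDGC is a per-sample context vector and that
# attention takes the form of channel-wise gating of backbone feature maps.
import torch
import torch.nn as nn


class ContextAttention(nn.Module):
    """Re-weights backbone feature maps using a global context vector."""

    def __init__(self, feat_channels: int, context_dim: int):
        super().__init__()
        # Map the context vector to one gate per feature channel.
        self.gate = nn.Sequential(
            nn.Linear(context_dim, feat_channels),
            nn.Sigmoid(),
        )

    def forward(self, feats: torch.Tensor, context: torch.Tensor) -> torch.Tensor:
        # feats:   (B, C, H, W) feature maps from the baseline network
        # context: (B, D) global context vector (stand-in for the CDGC)
        weights = self.gate(context)               # (B, C)
        return feats * weights[:, :, None, None]   # broadcast over H, W


if __name__ == "__main__":
    feats = torch.randn(2, 256, 14, 14)   # dummy backbone features
    context = torch.randn(2, 128)         # dummy global context vector
    out = ContextAttention(256, 128)(feats, context)
    print(out.shape)  # torch.Size([2, 256, 14, 14])
```

In such a design, the context-derived gates would suppress channels that respond to task-irrelevant content, which is one plausible way a network could exploit the guidance described above.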