CNN with large memory layers

This work is centred on the recently proposed product key memory structure \cite{large_memory}, which we implement for a number of computer vision applications. The memory structure can be regarded as a simple computation primitive that can be added to nearly any neural network architecture. The memory block provides sparse access to memory, with a cost that scales as the square root of the memory capacity. This scaling is possible because the key space is decomposed into a Cartesian product of smaller subspaces for the nearest neighbour search. We tested the memory layer on classification, image reconstruction and relocalization problems and found that for some of them the memory layers provide a significant speed/accuracy improvement with high utilization of the key-value elements, while others require more careful fine-tuning and suffer from dying keys. To tackle the latter problem, we introduce a simple memory re-initialization technique that re-initializes unused key-value pairs so that they take part in training again. Across our experiments we obtained improvements in speed and accuracy for classification and PoseNet relocalization models. We show that re-initialization has a large impact on a toy example with randomly labeled data, and we observe some performance gains on the image classification task. We also demonstrate that the large memory layers preserve their generalization properties on the relocalization problem, and we observe spatial correlations between the images and the selected memory cells.
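The square-root scaling mentioned above can be illustrated with a minimal sketch in plain Python. It assumes the usual product-key construction, in which the query is split into two halves and each full key is the concatenation of one sub-key from each of two sets of size n, giving n*n addressable memory slots. All function and variable names here are hypothetical and chosen for illustration; this is not the implementation used in the paper.

```python
import heapq
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def top_k(scores, k):
    """Return (score, index) pairs for the k largest scores, best first."""
    return heapq.nlargest(k, ((s, i) for i, s in enumerate(scores)))


def product_key_lookup(query, subkeys_1, subkeys_2, k):
    """Top-k search over all n*n product keys using only 2*n dot products.

    Full key (i, j) is subkeys_1[i] concatenated with subkeys_2[j], so its
    score is the sum of the two half scores. Returns (score, slot) pairs,
    where slot = i * n + j indexes the value table.
    """
    half = len(query) // 2
    q1, q2 = query[:half], query[half:]
    n = len(subkeys_2)

    best_1 = top_k([dot(q1, c) for c in subkeys_1], k)
    best_2 = top_k([dot(q2, c) for c in subkeys_2], k)

    # Only k*k candidate pairs are scored instead of all n*n keys.
    candidates = [(s1 + s2, i * n + j) for s1, i in best_1 for s2, j in best_2]
    return heapq.nlargest(k, candidates)


def read_memory(query, subkeys_1, subkeys_2, values, k):
    """Sparse read: softmax-weighted sum of the k selected value vectors."""
    hits = product_key_lookup(query, subkeys_1, subkeys_2, k)
    top_score = hits[0][0]
    weights = [math.exp(s - top_score) for s, _ in hits]  # stable softmax
    total = sum(weights)
    dim = len(values[0])
    return [
        sum(w / total * values[slot][d] for w, (_, slot) in zip(weights, hits))
        for d in range(dim)
    ]
```

With n sub-keys per half, the lookup scores 2n sub-keys and k*k candidate pairs while addressing n*n slots, which is where the square-root dependence on memory capacity comes from. Barring ties in the scores, the result matches an exhaustive search over all n*n keys, because any pair in the overall top k must have each half in the top k of its own sub-key set.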
