Pooling layers are essential building blocks of convolutional neural networks (CNNs): they reduce computational overhead and increase the receptive fields of subsequent convolutional operations. Their goal is to produce downsampled volumes that closely resemble the input volume while, ideally, also being computationally and memory efficient. Meeting both of these requirements remains a challenge. To this end, we propose an adaptive and exponentially weighted pooling method: adaPool. Our method learns a region-specific fusion of two sets of pooling kernels that are based on the exponent of the Dice-Sorensen coefficient and the exponential maximum, respectively. AdaPool improves the preservation of detail on a range of tasks, including image and video classification and object detection. A key property of adaPool is its bidirectional nature: in contrast to common pooling methods, the learned weights can also be used to upsample activation maps. We term this method adaUnPool. We evaluate adaUnPool on image and video super-resolution and frame interpolation. For benchmarking, we introduce Inter4K, a novel high-quality, high-frame-rate video dataset. Our experiments demonstrate that adaPool systematically achieves better results across tasks and backbones, while introducing only minor additional computational and memory overhead.