Tensor Methods in Computer Vision and Deep Learning

Tensors, or multidimensional arrays, are data structures that can naturally represent visual data of multiple dimensions. Inherently able to efficiently capture structured, latent semantic spaces and high-order interactions, tensors have a long history of applications across a wide span of computer vision problems. With the advent of the deep learning paradigm shift in computer vision, tensors have become even more fundamental. Indeed, essential ingredients of modern deep learning architectures, such as convolutions and attention mechanisms, can readily be viewed as tensor mappings. As a result, tensor methods are increasingly finding significant applications in deep learning, including the design of memory- and compute-efficient network architectures, improving robustness to random noise and adversarial attacks, and aiding the theoretical understanding of deep networks. This article provides an in-depth and practical review of tensors and tensor methods in the context of representation learning and deep learning, with a particular focus on visual data analysis and computer vision applications. Concretely, besides fundamental work on tensor-based visual data analysis methods, we focus on recent developments that have led to a gradual increase in the use of tensor methods, especially in deep learning architectures, and on their implications for computer vision applications. To help newcomers grasp these concepts quickly, we provide companion Python notebooks that cover key aspects of the paper and implement them step by step with TensorLy.
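As a small illustration of the claim that a convolution can be viewed as a tensor mapping, the following sketch writes a 2D convolution layer (strictly, a cross-correlation with no padding and stride 1) as a single contraction between a tensor of image patches and a fourth-order kernel tensor. It is not taken from the companion notebooks: it uses plain NumPy rather than TensorLy, and the function names `conv2d_as_contraction` and `conv2d_naive`, as well as the chosen axis layout, are assumptions made for this example.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def conv2d_as_contraction(image, kernel):
    """Valid, stride-1 cross-correlation written as one tensor contraction.

    image:  third-order tensor of shape (H, W, C_in)
    kernel: fourth-order tensor of shape (kh, kw, C_in, C_out)
    returns a tensor of shape (H - kh + 1, W - kw + 1, C_out)
    """
    kh, kw = kernel.shape[:2]
    # Fifth-order tensor of patches: (H - kh + 1, W - kw + 1, C_in, kh, kw).
    patches = sliding_window_view(image, (kh, kw), axis=(0, 1))
    # Contract the patch indices (i, j) and the input channel c with the kernel.
    return np.einsum("xycij,ijco->xyo", patches, kernel)


def conv2d_naive(image, kernel):
    """Reference implementation with explicit loops, used only as a check."""
    kh, kw, _, c_out = kernel.shape
    h, w, _ = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1, c_out))
    for x in range(out.shape[0]):
        for y in range(out.shape[1]):
            patch = image[x:x + kh, y:y + kw, :]  # shape (kh, kw, C_in)
            # Contract all three patch axes with the first three kernel axes.
            out[x, y] = np.tensordot(patch, kernel, axes=3)
    return out


rng = np.random.default_rng(0)
image = rng.standard_normal((8, 8, 3))       # a small RGB-like image tensor
kernel = rng.standard_normal((3, 3, 3, 4))   # 3x3 kernel, 3 input / 4 output channels

out = conv2d_as_contraction(image, kernel)
print(out.shape)
print(np.allclose(out, conv2d_naive(image, kernel)))
```

Viewing the kernel as a fourth-order tensor is what makes it natural to apply tensor decompositions to it, for example to reduce the number of parameters in a layer.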
