We propose an end-to-end trainable framework that processes large-scale visual data tensors by looking at \emph{only a fraction of their entries}. Our method combines a neural network encoder with a \emph{tensor train decomposition} to learn a low-rank latent encoding, coupled with cross-approximation (CA) to learn the representation from a subset of the original samples. CA is an adaptive sampling algorithm that is native to tensor decompositions and avoids working with the full high-resolution data explicitly. Instead, it actively selects local representative samples, which we fetch out-of-core and on demand. The required number of samples grows only logarithmically with the size of the input. Because the network represents the tensor implicitly, it can process large grids that would otherwise be intractable in their uncompressed form. The proposed approach is particularly useful for large-scale multidimensional grid data (e.g., 3D tomography) and for tasks that require context over a large receptive field (e.g., predicting the medical condition of entire organs). The code will be available at https://github.com/aelphy/c-pic.
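To make the sampling idea concrete, the following minimal sketch shows adaptive cross-approximation in its simplest setting, a two-dimensional matrix rather than a tensor train. It is an illustration under stated assumptions, not the implementation from the repository above: the names `EntryOracle` and `adaptive_cross` are hypothetical, the oracle merely simulates out-of-core access by counting fetched entries, and the row-selection rule is the standard partial-pivoting heuristic. For a rank-$r$ matrix of size $m \times n$, the sketch reads roughly $r(m+n)$ entries instead of $mn$. In the tensor-train setting, cross-approximation schemes typically need on the order of $d\,n\,r^2$ entries for $d$ modes of size $n$, which is logarithmic in the total size $n^d$.

```python
import numpy as np


class EntryOracle:
    """Hypothetical stand-in for out-of-core storage.

    Serves single rows and columns on demand and counts fetched entries.
    """

    def __init__(self, matrix):
        self._m = matrix
        self.shape = matrix.shape
        self.fetched = 0

    def row(self, i):
        self.fetched += self.shape[1]
        return self._m[i, :].copy()

    def col(self, j):
        self.fetched += self.shape[0]
        return self._m[:, j].copy()


def adaptive_cross(oracle, max_rank, tol=1e-10):
    """Adaptive cross-approximation with partial pivoting (matrix case).

    Returns factors U (m x k) and V (k x n) such that U @ V approximates
    the matrix behind `oracle`, touching only the fetched rows and columns.
    """
    n_rows, n_cols = oracle.shape
    us, vs = [], []
    used_rows = []
    i = 0  # assumption: start from the first row
    for _ in range(min(max_rank, n_rows)):
        used_rows.append(i)
        # Residual of row i: fetched row minus the current approximation.
        row = oracle.row(i)
        for u, v in zip(us, vs):
            row -= u[i] * v
        j = int(np.argmax(np.abs(row)))
        pivot = row[j]
        if abs(pivot) < tol:
            break  # residual row is numerically zero: stop
        # Residual of column j.
        col = oracle.col(j)
        for u, v in zip(us, vs):
            col -= v[j] * u
        us.append(col / pivot)
        vs.append(row)
        # Next row: largest residual entry in column j among unused rows.
        candidates = np.abs(col)
        candidates[used_rows] = -1.0
        i = int(np.argmax(candidates))
    if not us:
        return np.zeros((n_rows, 0)), np.zeros((0, n_cols))
    return np.column_stack(us), np.vstack(vs)


# Usage: a synthetic rank-5 matrix standing in for a large data grid.
rng = np.random.default_rng(0)
full = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 300))
oracle = EntryOracle(full)
U, V = adaptive_cross(oracle, max_rank=8)
rel_err = np.linalg.norm(full - U @ V) / np.linalg.norm(full)
print("rank:", U.shape[1])
print("fraction of entries fetched:", oracle.fetched / full.size)
print("relative error:", rel_err)
```

The full matrix is materialized here only to build the synthetic example and to measure the error; the approximation routine itself sees the data exclusively through the oracle's `row` and `col` calls.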