Tensor decompositions are powerful tools for dimensionality reduction and feature interpretation of multidimensional data such as signals. Existing tensor decomposition objectives (e.g., the Frobenius norm) are designed to fit raw data under statistical assumptions, which may not align with downstream classification tasks. Moreover, real-world tensor data are usually high-order and large, with millions or billions of entries, so decomposing the whole tensor with traditional algorithms is expensive. In practice, raw tensor data also contain redundant information, while data augmentation techniques may be used to smooth out noise in samples. This paper addresses these challenges by proposing augmented tensor decomposition (ATD), which effectively incorporates data augmentations to boost downstream classification. To reduce the memory footprint of the decomposition, we propose a stochastic algorithm that updates the factor matrices in a batch fashion. We evaluate ATD on multiple signal datasets. It shows comparable or better performance (e.g., up to 15% higher accuracy) than self-supervised and autoencoder baselines while using fewer than 5% of their model parameters, achieves a 0.6%–1.3% accuracy gain over other tensor-based baselines, and reduces the memory footprint by 9x compared with standard tensor decomposition algorithms.
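The batch-wise stochastic update of the factor matrices can be illustrated with a minimal sketch. This is not the paper's ATD algorithm: it assumes a plain CP (CANDECOMP/PARAFAC) model with a Frobenius-norm fitting objective, and all tensor sizes, rank, and hyperparameters below are illustrative. The key idea it shows is that each step touches only a sampled mini-batch of tensor entries, so the full tensor never has to be processed at once.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy rank-2 third-order tensor (illustrative sizes, not from the paper).
I, J, K, R = 20, 20, 20, 2
A0 = rng.normal(size=(I, R))
B0 = rng.normal(size=(J, R))
C0 = rng.normal(size=(K, R))
X = np.einsum('ir,jr,kr->ijk', A0, B0, C0)

# Factor matrices to learn, small random initialization.
A = rng.normal(scale=0.1, size=(I, R))
B = rng.normal(scale=0.1, size=(J, R))
C = rng.normal(scale=0.1, size=(K, R))

def rel_err():
    """Relative reconstruction error of the current CP factors."""
    recon = np.einsum('ir,jr,kr->ijk', A, B, C)
    return np.linalg.norm(X - recon) / np.linalg.norm(X)

init_err = rel_err()
lr, batch = 0.01, 64
for _ in range(8000):
    # Sample a mini-batch of entry indices; each update only reads these
    # entries, which is what keeps the per-step memory footprint small.
    i = rng.integers(0, I, batch)
    j = rng.integers(0, J, batch)
    k = rng.integers(0, K, batch)
    err = X[i, j, k] - np.sum(A[i] * B[j] * C[k], axis=1)  # batch residuals
    # Gradients of the squared error w.r.t. the sampled factor rows.
    gA = lr * err[:, None] * (B[j] * C[k])
    gB = lr * err[:, None] * (A[i] * C[k])
    gC = lr * err[:, None] * (A[i] * B[j])
    # np.add.at accumulates updates correctly for repeated indices in a batch.
    np.add.at(A, i, gA)
    np.add.at(B, j, gB)
    np.add.at(C, k, gC)

final_err = rel_err()  # should drop well below the initial relative error
```

Because each step samples a constant-size batch of entries, the update cost is independent of the total number of tensor entries, which is the property the abstract's 9x memory reduction relies on (the paper's actual objective additionally incorporates augmented samples).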