Modern data often take the form of a multiway array. However, most classification methods are designed for vectors, i.e., 1-way arrays. Distance weighted discrimination (DWD) is a popular high-dimensional classification method that has been extended to the multiway context, with dramatic improvements in performance when data have multiway structure. However, the previous implementation of multiway DWD was restricted to classification of matrices, and did not account for sparsity. In this paper, we develop a general framework for multiway classification which is applicable to any number of dimensions and any degree of sparsity. We conduct extensive simulation studies, which show that our model is robust to the degree of sparsity and improves classification accuracy when the data have multiway structure. For our motivating application, magnetic resonance spectroscopy (MRS) was used to measure the abundance of several metabolites across multiple neurological regions and across multiple time points in a mouse model of Friedreich's ataxia, yielding a four-way data array. Our method reveals a robust and interpretable multi-region metabolomic signal that discriminates the groups of interest. We also successfully apply our method to gene expression time course data for multiple sclerosis treatment. An R implementation is available in the package MultiwayClassification at http://github.com/lockEF/MultiwayClassification.