Similarity-driven multi-view linear reconstruction (SiMLR) is an algorithm that exploits inter-modality relationships to transform large scientific datasets into smaller, better-powered, and more interpretable low-dimensional spaces. SiMLR contributes a novel objective function for identifying joint signal, regularization based on sparse matrices representing prior within-modality relationships, and an implementation that permits joint reduction of large data matrices, each of which may have millions of entries. We demonstrate that SiMLR outperforms closely related methods on supervised learning problems in simulated data, a multi-omics cancer survival prediction dataset, and multi-modality neuroimaging datasets. Taken together, these results show that SiMLR may be applied with default parameters to joint signal estimation from disparate modalities and may yield practically useful results in a variety of application domains.
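
To make the idea of joint reduction across modalities concrete, the following is a minimal sketch of a much simpler baseline, not SiMLR's objective function or its regularization, neither of which is specified in this abstract. It standardizes two simulated data matrices that share one latent signal, concatenates them, and takes a truncated SVD to obtain a low-dimensional embedding. The function name `joint_embedding`, the simulated matrices, and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def joint_embedding(views, k):
    """Baseline joint reduction (an assumption, not SiMLR itself):
    truncated SVD of column-standardized, concatenated views."""
    scaled = []
    for X in views:
        Xc = X - X.mean(axis=0)               # center each feature
        Xc = Xc / (Xc.std(axis=0) + 1e-12)    # scale each feature to unit variance
        # Divide by sqrt(number of features) so wide views do not dominate.
        scaled.append(Xc / np.sqrt(Xc.shape[1]))
    Z = np.hstack(scaled)                      # samples x (all features)
    U, s, _ = np.linalg.svd(Z, full_matrices=False)
    return U[:, :k] * s[:k]                    # samples x k embedding


# Simulated example: two modalities driven by the same one-dimensional signal.
rng = np.random.default_rng(0)
n = 200
latent = rng.normal(size=(n, 1))
X1 = latent @ rng.normal(size=(1, 50)) + 0.5 * rng.normal(size=(n, 50))
X2 = latent @ rng.normal(size=(1, 80)) + 0.5 * rng.normal(size=(n, 80))

emb = joint_embedding([X1, X2], k=2)
# Absolute correlation between the first embedding axis and the shared signal.
corr = abs(np.corrcoef(emb[:, 0], latent[:, 0])[0, 1])
```

In this toy setting the first embedding axis tracks the shared latent signal closely. A method such as SiMLR, as described above, would additionally use a dedicated joint-signal objective and sparse regularization matrices encoding prior within-modality relationships, which this baseline omits.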