We consider the problem of extracting joint and individual signals from multi-view data, that is, data collected from different sources on matched samples. While existing methods for multi-view data decomposition consider only a single matching of the data, by samples, we focus on double-matched multi-view data, which are matched by both samples and source features. Our motivating example is miRNA data collected from both primary tumor and normal tissues of the same subjects; the measurements from the two tissues are thus matched both by subjects and by miRNAs. Our proposed double-matched matrix decomposition simultaneously extracts joint and individual signals across subjects, as well as joint and individual signals across miRNAs. Our estimation approach takes advantage of double-matching by formulating a new type of optimization problem with explicit row space and column space constraints, for which we develop an efficient iterative algorithm. Numerical studies indicate that taking advantage of double-matching leads to superior signal estimation performance compared to existing multi-view data decomposition methods based on single-matching. We apply our method to the miRNA data as well as to data from English Premier League soccer matches, and find joint and individual multi-view signals that align with domain-specific knowledge.
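To make the notion of double-matching concrete, the following is a minimal sketch in Python with NumPy; it is not the estimation algorithm proposed here. It simulates two views of identical shape (samples by features) that share one column-space direction and one row-space direction, then recovers each shared direction with a simple principal-angle heuristic applied to per-view truncated SVDs. The function names (`simulate_views`, `shared_direction`, `joint_directions`), the rank-one joint structure, and the signal and noise levels are all illustrative assumptions.

```python
import numpy as np


def unit(rng, m):
    """Draw a random unit vector of length m."""
    x = rng.standard_normal(m)
    return x / np.linalg.norm(x)


def simulate_views(n=60, p=40, strength=20.0, noise=0.1, seed=0):
    """Simulate two n-by-p views matched by rows (samples) and columns (features).

    Assumed toy model: each view has column space span(u_joint, b_k) and
    row space span(v_joint, a_k), so u_joint and v_joint are the joint
    directions while a_k and b_k are individual to view k.
    """
    rng = np.random.default_rng(seed)
    u_joint, v_joint = unit(rng, n), unit(rng, p)
    views = []
    for _ in range(2):
        a_k = unit(rng, p)  # view-specific row-space direction
        b_k = unit(rng, n)  # view-specific column-space direction
        signal = strength * (np.outer(u_joint, a_k) + np.outer(b_k, v_joint))
        views.append(signal + noise * rng.standard_normal((n, p)))
    return views, u_joint, v_joint


def shared_direction(bases):
    """Return the unit vector in span(bases[0]) closest to span(bases[1]).

    The singular values of B1.T @ B2 are the cosines of the principal
    angles between the two subspaces; the top left singular vector gives
    the best-aligned direction.
    """
    B1, B2 = bases
    W, cosines, _ = np.linalg.svd(B1.T @ B2)
    return B1 @ W[:, 0], cosines[0]


def joint_directions(X1, X2, rank=2):
    """Estimate one joint column direction and one joint row direction."""
    col_bases, row_bases = [], []
    for X in (X1, X2):
        U, _, Vt = np.linalg.svd(X, full_matrices=False)
        col_bases.append(U[:, :rank])    # orthonormal basis, column space
        row_bases.append(Vt[:rank].T)    # orthonormal basis, row space
    u_hat, _ = shared_direction(col_bases)
    v_hat, _ = shared_direction(row_bases)
    return u_hat, v_hat


if __name__ == "__main__":
    (X1, X2), u_joint, v_joint = simulate_views()
    u_hat, v_hat = joint_directions(X1, X2)
    # Absolute cosine with the true joint directions (sign is arbitrary).
    print("column-space alignment:", abs(u_hat @ u_joint))
    print("row-space alignment:", abs(v_hat @ v_joint))
```

With single-matched data only one of the two comparisons is available, since the views then share either rows or columns but not both. The sketch treats the two spaces separately, whereas the approach described above constrains row and column spaces within one optimization problem.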