In myriad statistical applications, data are collected from related but heterogeneous sources. These sources share some commonalities while containing idiosyncratic characteristics. More specifically, consider the setting where observation matrices from $N$ sources $\{M_{i}\}_{i=1}^N$ are generated from a few common and source-specific factors. Is it possible to recover the shared and source-specific factors? We show that under appropriate conditions on the alignment of source-specific factors, the problem is well-defined and both shared and source-specific factors are identifiable under a constrained matrix factorization objective. To solve this objective, we propose a new class of matrix factorization algorithms, called Heterogeneous Matrix Factorization. HMF is easy to implement, enjoys local linear convergence under suitable assumptions, and is intrinsically distributed. Through a variety of empirical studies, we showcase the advantageous properties of HMF and its potential application in feature extraction and change detection.
翻译:暂无翻译