Large-scale Model Personalization via Low Rank and Sparse Decomposition

Personalizing machine learning (ML) predictions for individual users, domains, or enterprises is critical for practical recommendation-style systems. Standard personalization approaches learn a user/domain-specific embedding that is fed into a fixed global model, which can be limiting. On the other hand, personalizing or fine-tuning the model itself for each user/domain (a.k.a. meta-learning) has a high storage/infrastructure cost. We propose a novel meta-learning-style approach that models network weights as a sum of low-rank and sparse matrices. The low-rank part captures information common to multiple individuals/users, while the sparse part captures user-specific idiosyncrasies. Furthermore, the framework is up to two orders of magnitude more scalable (in terms of storage/infrastructure cost) than user-specific fine-tuning of the model. We then study the framework in the linear setting, where the problem reduces to estimating the sum of a rank-$r$ matrix and a $k$-column sparse matrix from a small number of linear measurements. We propose an alternating minimization method with iterative hard thresholding, AMHT-LRS, to learn the low-rank and sparse parts. For the realizable, Gaussian data setting, we show that AMHT-LRS solves the problem efficiently with a nearly optimal number of samples. A significant challenge in personalization is ensuring the privacy of each user's sensitive data. We alleviate this problem by proposing a differentially private variant of our method that is also equipped with strong generalization guarantees. Finally, on multiple standard recommendation datasets, we demonstrate that our approach allows personalized models to obtain superior performance in the sparse-data regime.
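
To make the linear setting concrete, here is a minimal NumPy sketch under stated assumptions. It assumes each user $i$ has a parameter vector $U w_i + b_i$, where $U$ is a shared $d \times r$ matrix, $w_i$ is a length-$r$ vector, and $b_i$ has at most $k$ nonzero entries. That parameterization, and the names `hard_threshold`, `fit_user`, and `storage_counts`, are illustrative and not taken from the abstract. The per-user routine holds $U$ fixed and alternates a least-squares solve for $w_i$ with a hard-thresholded gradient step for $b_i$. It is loosely in the spirit of alternating minimization with iterative hard thresholding, and it is not the authors' AMHT-LRS algorithm, whose details are not given here.

```python
import numpy as np


def hard_threshold(v, k):
    """Keep the k largest-magnitude entries of v; set the rest to zero."""
    out = np.zeros_like(v)
    k = min(k, v.size)
    if k > 0:
        keep = np.argpartition(np.abs(v), -k)[-k:]
        out[keep] = v[keep]
    return out


def fit_user(X, y, U, k, steps=50):
    """Fit one user's (w, b) for y ~ X @ (U @ w + b), with the shared U held fixed.

    Hypothetical helper: alternates a least-squares update of w with a
    unit-step gradient update of b followed by hard thresholding to k entries.
    The unit step assumes roughly isotropic features (e.g. Gaussian X).
    """
    m = X.shape[0]
    XU = X @ U
    b = np.zeros(X.shape[1])
    for _ in range(steps):
        # Low-rank coefficients: exact least squares given the current b.
        w, *_ = np.linalg.lstsq(XU, y - X @ b, rcond=None)
        # Sparse part: gradient step on the squared loss, then keep k entries.
        resid = y - XU @ w - X @ b
        b = hard_threshold(b + X.T @ resid / m, k)
    # Final refit of w so that it matches the returned b.
    w, *_ = np.linalg.lstsq(XU, y - X @ b, rcond=None)
    return w, b


def storage_counts(d, r, k, t):
    """Numbers stored for t users: a full d-vector per user vs. low-rank plus sparse."""
    full = t * d
    # Shared U (d*r), plus per user: w (r) and k (index, value) pairs for b.
    low_rank_sparse = d * r + t * (r + 2 * k)
    return full, low_rank_sparse
```

As an illustration with made-up sizes (not figures from the paper), `storage_counts(d=1000, r=5, k=10, t=10000)` gives 10,000,000 stored numbers for per-user models versus 255,000 for the low-rank plus sparse form, roughly a 39-fold reduction. The gap widens as $d$ grows relative to $r + 2k$.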
