Online personalized recommendation services are generally hosted in the cloud, where users query a cloud-based model to receive recommendations such as merchandise of interest or news feed items. State-of-the-art recommendation models rely on sparse and dense features to represent users' profile information and the items they interact with. Although sparse features account for 99% of the total model size, little attention has been paid to the potential information leakage through them. These sparse features are employed to track users' behavior, e.g., their click history and object interactions, and so potentially carry each user's private information. Sparse features are represented as learned embedding vectors stored in large tables, and personalized recommendation is performed by using a specific user's sparse features to index into those tables. Even with recently proposed methods that hide the computation happening in the cloud, an attacker in the cloud may still be able to track the access patterns to the embedding tables. This paper explores the private information that may be learned by tracking a recommendation model's sparse feature access patterns. We first characterize the types of attacks that can be carried out on sparse features in recommendation models in an untrusted cloud, and then demonstrate how each of these attacks leads to extracting users' private information or tracking users by their behavior over time.
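To make the leakage channel concrete, the following is a minimal sketch of an embedding lookup whose row accesses are visible to an observer. It is a toy illustration, not the paper's implementation: the class `EmbeddingTable`, its `access_log` field, the table size, and the click history are all hypothetical names and values chosen for this example. It assumes the pooled output could be hidden from the attacker while the row indices being read could not.

```python
import random


class EmbeddingTable:
    """Toy embedding table that records which rows are read (hypothetical)."""

    def __init__(self, num_rows, dim, seed=0):
        rng = random.Random(seed)
        # One learned vector per sparse feature value (random here).
        self.rows = [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
                     for _ in range(num_rows)]
        # Stand-in for what an observer of memory accesses would see.
        self.access_log = []

    def lookup(self, indices):
        """Sum-pool the embedding rows selected by the sparse feature."""
        self.access_log.append(tuple(indices))
        dim = len(self.rows[0])
        pooled = [0.0] * dim
        for i in indices:
            for d in range(dim):
                pooled[d] += self.rows[i][d]
        return pooled


table = EmbeddingTable(num_rows=1000, dim=4)

# Assumed sparse feature: ids of items a user clicked on.
user_clicks = [17, 256, 903]
pooled_vector = table.lookup(user_clicks)

# Even if pooled_vector were computed in a protected way, the logged
# indices coincide with the user's click history.
observed = table.access_log[-1]
```

In this sketch the observer never reads the embedding values or the pooled result, yet `observed` equals the user's click history, which is the kind of access-pattern signal the abstract refers to.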