Embeddings, low-dimensional vector representations of objects, are fundamental to building modern machine learning systems. In industrial settings, there is usually an embedding team that trains an embedding model to solve intended tasks (e.g., product recommendation). The resulting embeddings are then widely used by consumer teams to solve their unintended tasks (e.g., fraud detection). However, as the embedding model is updated and retrained to improve performance on the intended task, the newly generated embeddings are no longer compatible with the existing consumer models. This means that either historical versions of the embeddings can never be retired, or all consumer teams must retrain their models to make them compatible with the latest version of the embeddings; both options are extremely costly in practice. Here we study the problem of embedding version updates and their backward compatibility. We formalize the problem, where the goal is for the embedding team to keep updating the embedding version while the consumer teams do not have to retrain their models. We develop a solution based on learning backward compatible embeddings, which allows the embedding model version to be updated frequently while also allowing the latest version of the embedding to be quickly transformed into any of its backward compatible historical versions, so that consumer teams do not have to retrain their models. Under our framework, we explore six methods and systematically evaluate them on a real-world recommender system application. We show that the best method, which we call BC-Aligner, maintains backward compatibility with existing unintended tasks even after multiple model version updates. At the same time, BC-Aligner achieves intended task performance similar to that of an embedding model optimized solely for the intended task.
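The idea of transforming the latest embedding version back into a historical one, so that a frozen consumer model keeps working, can be illustrated with a minimal sketch. The abstract does not specify the form of the transformation, so the linear map fitted by least squares below is an assumption, and this is not an implementation of BC-Aligner. All names (`emb_v0`, `emb_v1`, `W`, `consumer_predict`) and the synthetic data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_items, dim = 500, 16

# Hypothetical version-0 embeddings that a consumer model was trained on.
emb_v0 = rng.normal(size=(n_items, dim))

# Hypothetical version-1 embeddings: simulated as a rotated, slightly
# perturbed copy of version 0, standing in for a retrained embedding model.
rotation, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
emb_v1 = emb_v0 @ rotation + 0.01 * rng.normal(size=(n_items, dim))

# Assumption: the backward transformation is linear. Fit W by least squares
# so that emb_v1 @ W approximates emb_v0.
W, *_ = np.linalg.lstsq(emb_v1, emb_v0, rcond=None)

# A frozen consumer model: a fixed linear scorer over version-0 embeddings.
consumer_weights = rng.normal(size=dim)


def consumer_predict(emb):
    """Binary predictions of the frozen consumer model."""
    return (emb @ consumer_weights > 0).astype(int)


labels_v0 = consumer_predict(emb_v0)

# Agreement with the original predictions, without and with the transformation.
agree_raw = np.mean(consumer_predict(emb_v1) == labels_v0)
agree_aligned = np.mean(consumer_predict(emb_v1 @ W) == labels_v0)

# Relative reconstruction error of the transformed embeddings.
rel_err = np.linalg.norm(emb_v1 @ W - emb_v0) / np.linalg.norm(emb_v0)

print(f"agreement without transformation: {agree_raw:.3f}")
print(f"agreement with transformation:    {agree_aligned:.3f}")
print(f"relative reconstruction error:    {rel_err:.4f}")
```

In this toy setting, the consumer model is never retrained; only the new embeddings are mapped back into the old space before being passed to it. Real embedding updates need not be related by a simple linear map, which is part of what makes the problem studied here non-trivial.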