About 50% of all queries on Snapchat app are targeted at finding the right friend to interact with. Since everyone has a unique list of friends and that list is not very large (maximum a few thousand), it makes sense to perform this search locally, on users' devices. In addition, the friend list is already available for other purposes, such as showing the chat feed, and the latency savings can be significant by avoiding a server round-trip call. Historically, we resorted to substring matching, ranking prefix matches at the top of the result list. Introducing the ability to perform fuzzy search on a resource-constrained device and in the environment where typo's are prevalent is both prudent and challenging. In this paper, we describe our efficient and accurate two-step approach to fuzzy search, characterized by a skip-bigram retrieval layer and a novel local Levenshtein distance computation used for final ranking.
翻译:Snapchat 应用程序的所有查询中,大约50%的查询都针对寻找可以互动的正确朋友。 因为每个人都有独特的朋友名单, 名单并不很大( 最多几千个), 因此在本地对用户设备进行这种搜索是有道理的。 此外, 朋友名单已经可供其他用途使用, 例如显示聊天种子, 避免服务器的圆脚调来节省时间。 从历史上看, 我们使用子字符串匹配, 在结果列表的顶端排序前缀匹配 。 引入对资源限制的设备进行模糊搜索的能力, 以及在 RBO 盛行的环境中进行模糊搜索的能力既谨慎又具有挑战性 。 在本文中, 我们描述了我们高效、 准确的两步搜索方法, 包括显示聊天的回放层, 以及用于最后排序的本地新版 Levestein 远程计算 。