Voice assistants provide users with a new way of interacting with digital products, allowing them to retrieve information and complete tasks with an increased sense of control and flexibility. Such products comprise several machine learning models, such as Speech-to-Text transcription, Named Entity Recognition and Resolution, and Text Classification. Building a voice assistant from scratch takes the prolonged efforts of several teams constructing numerous models and orchestrating between components. Alternatives such as using third-party vendors or re-purposing existing models may be considered to shorten time-to-market and reduce development costs. However, each option has its benefits and drawbacks. We present key insights from building a voice search assistant for the Booking.com search and recommendation system. Our paper compares the achieved performance and development efforts of dedicated tailor-made solutions against those of existing re-purposed models. We share and discuss our data-driven decisions about implementation trade-offs and their estimated outcomes in hindsight, showing that a fully functional machine learning product can be built from existing models.