A core challenge in many real-world applications is matching a query to the best document from a mutable, finite set of candidates. Existing industrial solutions, especially latency-constrained services, often rely on similarity algorithms that sacrifice quality for speed. In this paper we introduce a generic semantic learning-to-rank framework, Self-training Semantic Cross-attention Ranking (sRank). This transformer-based framework uses a linear pairwise loss with mutable training batch sizes, delivering both quality gains and high efficiency, and has been applied effectively to two industrial tasks at Microsoft on real-world, large-scale data sets: Smart Reply (SR) and Ambient Clinical Intelligence (ACI). In Smart Reply, sRank assists customers seeking live technical support by selecting the best reply from a set of predefined solutions, based on the messages exchanged between consumers and support agents. It achieves an 11.7% gain in offline top-one accuracy over the previous system and, according to telemetry recorded since its general release in January 2021, has reduced message-composition time by 38.7%. In the ACI task, sRank selects relevant historical physician templates that guide a text summarization model toward higher-quality medical notes, achieving a 35.5% top-one accuracy gain along with a 46% relative ROUGE-L gain on the generated notes.
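The abstract names a linear pairwise loss computed over batches whose size varies with the number of candidates per query. As a rough illustration only, the PyTorch sketch below shows one common reading of such a loss, a margin-based hinge that is linear in the score difference between the positive candidate and each negative; the function name, margin value, and single-positive assumption are ours and are not taken from the paper.

```python
import torch
import torch.nn.functional as F

def linear_pairwise_loss(scores: torch.Tensor,
                         pos_index: int = 0,
                         margin: float = 1.0) -> torch.Tensor:
    """Linear (hinge-style) pairwise loss over one query's candidate scores.

    `scores` holds the ranker's relevance score for every candidate in the
    batch; `pos_index` marks the ground-truth reply/template. The number of
    candidates (the batch size) may differ from query to query.
    This is an illustrative sketch, not the paper's exact formulation.
    """
    pos_score = scores[pos_index]
    neg_scores = torch.cat([scores[:pos_index], scores[pos_index + 1:]])
    # Linear penalty whenever a negative scores within `margin` of the positive.
    return F.relu(margin + neg_scores - pos_score).mean()

# Example: one query with 4 candidates (count varies per query); candidate 0 is correct.
scores = torch.tensor([2.3, 1.1, 0.4, 2.0], requires_grad=True)
loss = linear_pairwise_loss(scores, pos_index=0)
loss.backward()
```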