Users often formulate search queries in immature language, without well-developed keywords or complete structure. Such queries fail to express the users' true information needs and introduce ambiguity, since fragmentary language often admits multiple interpretations and aspects. This makes it hard for search engines to process and understand the query, and eventually leads to unsatisfactory retrieval results. An alternative to answering an ambiguous query directly is to proactively ask the user clarifying questions. Recent years have seen many works and shared tasks from both the NLP and IR communities on identifying the need for clarifying questions and on methods for generating them. A fact these works often neglect is that even when the need for a clarifying question is correctly recognized, the questions these systems generate can still be off-topic and dissatisfying to users, and may simply cause them to leave the conversation. In this work, we propose a risk-aware conversational search agent model that balances the risk of answering the user's query against the risk of asking clarifying questions. The agent is fully aware that asking clarifying questions can potentially collect more information from the user, but it compares all the choices available to it and evaluates their risks. Only then does it decide between answering and asking. To demonstrate that our system is able to retrieve better answers, we conduct experiments on the MSDialog dataset, which contains real-world customer service conversations from the Microsoft products community. We also propose a reinforcement learning strategy that allows us to train our model on the original dataset directly and saves us from any further data annotation effort. Our experimental results show that our risk-aware conversational search agent significantly outperforms strong non-risk-aware baselines.