Clarification questions can improve online search results by eliciting the user's information need. This research predicts user engagement with a clarification pane, treated as an indicator of relevance, from lexical information alone: the query, the question, and the answers. The predicted engagement can then serve as a feature for ranking clarification panes. Regression and classification models are applied to predict user engagement and are compared with naive heuristic baselines (e.g., the mean) on the new MIMICS dataset [20]. An ablation study with a RankNet model tests whether the predicted user engagement improves clarification pane ranking performance. The prediction models significantly outperformed the naive baselines, and the predicted user engagement feature significantly improved the RankNet results in terms of NDCG and MRR. This research demonstrates the potential of ranking clarification panes using lexical information only and can serve as a first neural baseline for future research to improve on. The code is available online.
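As a rough illustration of the two ranking metrics named above, the following sketch computes NDCG and MRR for panes ordered by a predicted engagement score. It is not the authors' released code: the function names, the example scores, and the engagement labels are hypothetical, and treating the pane with the highest observed engagement as the "relevant" one for MRR is an assumption made only for this example.

```python
import math


def dcg(gains):
    # Discounted cumulative gain: the gain at 1-based rank r is divided by log2(r + 1).
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))


def rank_by_score(predicted_scores):
    # Indices of the panes, ordered from highest to lowest predicted score.
    return sorted(range(len(predicted_scores)),
                  key=lambda i: predicted_scores[i], reverse=True)


def ndcg(predicted_scores, true_gains):
    # DCG of the predicted ordering, normalized by the DCG of the ideal ordering.
    ranked = [true_gains[i] for i in rank_by_score(predicted_scores)]
    ideal = dcg(sorted(true_gains, reverse=True))
    return dcg(ranked) / ideal if ideal > 0 else 0.0


def reciprocal_rank(predicted_scores, true_gains):
    # Assumption: the "relevant" pane is the one with the highest observed engagement.
    best = max(true_gains)
    for rank, i in enumerate(rank_by_score(predicted_scores), start=1):
        if true_gains[i] == best:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(queries):
    # queries: list of (predicted_scores, true_gains) pairs, one pair per query.
    return sum(reciprocal_rank(p, g) for p, g in queries) / len(queries)


# Hypothetical data: three candidate panes for each of two queries.
queries = [
    ([0.2, 0.9, 0.5], [0, 3, 1]),  # most engaging pane is ranked first
    ([0.9, 0.2, 0.5], [0, 3, 1]),  # most engaging pane is ranked last
]
print([round(ndcg(p, g), 3) for p, g in queries])
print(round(mean_reciprocal_rank(queries), 3))
```

In an ablation of the kind described, these metrics would be computed once for a ranker that uses the predicted engagement feature and once for a ranker that does not, and the two sets of scores compared.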