State-of-the-art (SotA) language models are often accurate on question-answering benchmarks with well-defined questions. Yet in real settings, many questions are unanswerable without asking the user for clarifying information. We show that current SotA models often do not ask the user for clarification when presented with imprecise questions and instead provide incorrect answers or "hallucinate". To address this, we introduce CLAM, a framework that first uses the model to detect ambiguous questions and, if an ambiguous question is detected, prompts the model to ask the user for clarification. Furthermore, we show how to construct a scalable and cost-effective automatic evaluation protocol in which an oracle language model with privileged information provides the clarifying information. We show that our method achieves a 20.15 percentage point accuracy improvement over SotA on a novel ambiguous question-answering dataset derived from TriviaQA.
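To make the pipeline concrete, below is a minimal sketch of the selective-clarification loop described above, under stated assumptions: `ask_model` is a hypothetical placeholder for any chat LLM API, `get_user_reply` stands in for either a human user or the oracle model used in automatic evaluation, and the prompt wording is illustrative rather than the paper's exact prompts.

```python
def ask_model(prompt: str) -> str:
    """Hypothetical placeholder for a call to a large language model."""
    raise NotImplementedError


def clam_answer(question: str, get_user_reply) -> str:
    """Selectively ask for clarification before answering, CLAM-style."""
    # Step 1: use the model itself to detect whether the question is ambiguous.
    verdict = ask_model(
        f"Question: {question}\n"
        "Is this question ambiguous, i.e. unanswerable without more "
        "information from the user? Answer Yes or No."
    )
    if verdict.strip().lower().startswith("yes"):
        # Step 2: if ambiguity is detected, ask the user a clarifying question.
        clarifying_question = ask_model(
            f"Question: {question}\n"
            "Ask the user one clarifying question that would resolve the ambiguity."
        )
        clarification = get_user_reply(clarifying_question)
        # Step 3: answer the now-disambiguated question.
        return ask_model(
            f"Question: {question}\n"
            f"Clarification from the user: {clarification}\n"
            "Answer the question."
        )
    # Unambiguous questions are answered directly, with no extra dialogue turn.
    return ask_model(f"Question: {question}\nAnswer the question.")
```

In this sketch, passing an oracle language model with privileged access to the intended interpretation as `get_user_reply` yields the automatic evaluation protocol, since no human needs to be in the loop.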