For a natural language problem that requires some non-trivial reasoning to solve, there are at least two ways to do it using a large language model (LLM). One is to ask it to solve it directly. The other is to use it to extract the facts from the problem text and then use a theorem prover to solve it. In this note, we compare the two methods using ChatGPT and GPT4 on a series of logic word puzzles, and conclude that the latter is the right approach.
翻译:对于一个需要一些非平凡推理才能解决的自然语言问题,至少有两种方法可以使用大型语言模型(LLM)来解决。一种方法是让它直接解决问题。另一种方法是使用它从问题文本中提取事实,然后使用定理证明器来解决问题。在本文中,我们将使用ChatGPT和GPT4对一系列逻辑单词谜题进行比较,得出后者是正确的方法。