Cognitive psychology is concerned with understanding perception, attention, memory, language, problem-solving, decision-making, and reasoning. Large language models (LLMs) are emerging as potent tools that are increasingly capable of performing tasks at a human level. The recent release of GPT-4, and its demonstrated success on tasks that are difficult for humans, such as exams and complex problems, has increased confidence that LLMs can become perfect instruments of intelligence. Although the GPT-4 report presents its performance on some cognitive psychology tasks, a comprehensive assessment of GPT-4 on existing, well-established datasets is still required. In this study, we evaluate GPT-4's performance on a set of cognitive psychology datasets, including CommonsenseQA, SuperGLUE, MATH, and HANS. In doing so, we examine how GPT-4 processes and integrates cognitive psychology with contextual information, which provides insight into the underlying cognitive processes that enable it to generate responses. We show that GPT-4 achieves high accuracy on cognitive psychology tasks relative to prior state-of-the-art models. Our results strengthen existing assessments of, and confidence in, GPT-4's cognitive psychology abilities. GPT-4 has significant potential to revolutionize the field of AI by enabling machines to bridge the gap between human and machine reasoning.