Event extraction is a fundamental task in natural language processing that involves identifying and extracting information about events mentioned in text. It is challenging because annotated data is scarce, and obtaining it is expensive and time-consuming. The emergence of large language models (LLMs) such as ChatGPT provides an opportunity to solve language tasks with simple prompts, without task-specific datasets or fine-tuning. While ChatGPT has demonstrated impressive results in tasks like machine translation, text summarization, and question answering, it presents challenges when used for complex tasks like event extraction. Unlike those tasks, event extraction requires the model to be given a complex set of instructions defining all event types and their schemas. To explore the feasibility of ChatGPT for event extraction and the challenges it poses, we conducted a series of experiments. Our results show that, in long-tail and complex scenarios, ChatGPT reaches on average only 51.04% of the performance of a task-specific model such as EEQA. Our usability testing experiments indicate that ChatGPT is not robust enough: continued refinement of the prompt does not lead to stable performance improvements, which can result in a poor user experience. In addition, ChatGPT is highly sensitive to different prompt styles.
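To illustrate why the instructions for event extraction grow complex, here is a minimal sketch of how a prompt might be assembled from a set of event types and their argument roles. The schema, the `build_prompt` helper, and the requested output format are all hypothetical and chosen for illustration; they are not the prompts or schemas used in the experiments.

```python
from typing import Dict, List

# Hypothetical event schema: event type -> argument roles.
# Illustrative only; not the schema used in the paper.
EVENT_SCHEMA: Dict[str, List[str]] = {
    "Attack": ["Attacker", "Target", "Instrument", "Place"],
    "Transport": ["Agent", "Artifact", "Origin", "Destination"],
}


def build_prompt(schema: Dict[str, List[str]], text: str) -> str:
    """Assemble a single instruction prompt listing every event type and its roles."""
    lines = [
        "Extract all events from the text below.",
        "Event types and their argument roles:",
    ]
    # One line per event type; the prompt grows with the size of the schema.
    for event_type, roles in schema.items():
        lines.append(f"- {event_type}: {', '.join(roles)}")
    lines.append(
        'Answer as a JSON list of objects with keys "event_type", "trigger", and "arguments".'
    )
    lines.append(f"Text: {text}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_prompt(EVENT_SCHEMA, "Rebels attacked the convoy near the border."))
```

Every event type adds a line to the prompt, so a realistic schema with dozens of types and roles yields a long instruction block, and small changes to its wording or layout are one plausible source of the prompt sensitivity described above.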