KEO: Knowledge Extraction on OMIn via Knowledge Graphs and RAG for Safety-Critical Aviation Maintenance

We present Knowledge Extraction on OMIn (KEO), a domain-specific framework for knowledge extraction and reasoning with large language models (LLMs) in safety-critical contexts. Using the Operations and Maintenance Intelligence (OMIn) dataset, we construct a question-answering (QA) benchmark spanning global sensemaking and actionable maintenance tasks. KEO builds a structured knowledge graph (KG) and integrates it into a retrieval-augmented generation (RAG) pipeline, enabling more coherent, dataset-wide reasoning than traditional text-chunk RAG. We evaluate locally deployable LLMs (Gemma-3, Phi-4, Mistral-Nemo) and use stronger models (GPT-4o, Llama-3.3) as judges. Experiments show that KEO markedly improves global sensemaking by surfacing patterns and system-level insights, while text-chunk RAG remains effective for fine-grained procedural tasks that require localized retrieval. These findings underscore the promise of KG-augmented LLMs for secure, domain-specific QA and their potential in high-stakes reasoning.
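
To make the contrast between the two retrieval styles concrete, the following is a minimal sketch and not KEO's actual implementation. The records are toy examples invented for illustration (they are not drawn from OMIn), the triples are hand-written stand-ins for an LLM-based extraction step, and the names `chunk_retrieve`, `build_graph`, and `graph_retrieve` are hypothetical. Token overlap is used here in place of embedding similarity purely to keep the example self-contained.

```python
from collections import Counter, defaultdict

# Toy maintenance records, invented for illustration; NOT from the OMIn dataset.
RECORDS = [
    "Engine lost power on climb; fuel pump failed.",
    "Rough running engine; fuel pump replaced.",
    "Nose gear collapsed on landing; actuator cracked.",
    "Engine quit in cruise; fuel line blocked.",
]

# Hand-written (record_id, subject, relation, object) triples. In a real
# pipeline these would come from an extraction step (assumed, not shown).
TRIPLES = [
    (0, "engine", "affected_by", "fuel pump"),
    (1, "engine", "affected_by", "fuel pump"),
    (2, "nose gear", "affected_by", "actuator"),
    (3, "engine", "affected_by", "fuel line"),
]


def tokenize(text):
    """Lowercase, split on whitespace, and strip simple punctuation."""
    return {w.strip(".;,?").lower() for w in text.split()} - {""}


def chunk_retrieve(query, k=2):
    """Text-chunk stand-in: return the k records with most token overlap."""
    q = tokenize(query)
    ranked = sorted(RECORDS, key=lambda r: len(q & tokenize(r)), reverse=True)
    return ranked[:k]


def build_graph(triples):
    """Aggregate triples into subject -> Counter of related objects."""
    graph = defaultdict(Counter)
    for _record_id, subj, _relation, obj in triples:
        graph[subj][obj] += 1
    return graph


def graph_retrieve(graph, entity):
    """Graph stand-in: return every object linked to entity, with counts."""
    return graph.get(entity, Counter()).most_common()


if __name__ == "__main__":
    question = "Which components most often cause engine problems?"
    # Local view: only k records are seen, so some engine events are missed.
    print(chunk_retrieve(question, k=2))
    # Global view: counts are aggregated over every record in the toy set.
    print(graph_retrieve(build_graph(TRIPLES), "engine"))
```

In this toy setting the chunk retriever returns only the top `k` records, which suits a localized procedural lookup, whereas the graph lookup aggregates over all records at once, which is the kind of dataset-wide context a global sensemaking question needs.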
