Prior work on scientific question answering has largely emphasized chatbot-style systems, with limited exploration of fine-tuning foundation models for domain-specific reasoning. In this study, we developed a chatbot for the University of Limerick's Department of Electronic and Computer Engineering to provide course information to students. A custom dataset of 1,203 question-answer pairs in SQuAD format was constructed from the university's Book of Modules, supplemented with manually and synthetically generated entries. We fine-tuned BERT (Devlin et al., 2019) using PyTorch and evaluated performance with Exact Match and F1 scores. Results show that even modest fine-tuning improves hypothesis framing and knowledge extraction, demonstrating the feasibility of adapting foundation models to educational domains. While domain-specific BERT variants such as BioBERT and SciBERT exist for biomedical and scientific literature, no foundation model has yet been tailored to university course materials. Our work addresses this gap by showing that fine-tuning BERT on academic QA pairs yields effective results, highlighting the potential to scale towards the first domain-specific QA model for universities and to enable autonomous educational knowledge systems.
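The pipeline summarized above can be outlined in code. The following is a minimal sketch, not the authors' implementation: it assumes the HuggingFace Transformers and Datasets libraries on top of PyTorch, a bert-base-uncased checkpoint, and that the SQuAD-format pairs have been flattened into a JSON-lines file named ul_ece_qa.jsonl with question, context, answer_text, and answer_start fields. The file name, field names, and hyperparameters are illustrative placeholders rather than values reported in the paper.

```python
from datasets import load_dataset
from transformers import (AutoModelForQuestionAnswering, AutoTokenizer,
                          Trainer, TrainingArguments, default_data_collator)

MODEL_NAME = "bert-base-uncased"  # base BERT checkpoint (Devlin et al., 2019)
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForQuestionAnswering.from_pretrained(MODEL_NAME)

# Hypothetical flattened export of the 1,203 SQuAD-format QA pairs.
raw = load_dataset("json", data_files={"train": "ul_ece_qa.jsonl"})["train"]

def preprocess(example):
    """Tokenize question + context and map the character-level answer span
    to start/end token positions, as required for extractive QA training."""
    enc = tokenizer(
        example["question"],
        example["context"],
        truncation="only_second",   # truncate the context, never the question
        max_length=384,
        padding="max_length",
        return_offsets_mapping=True,
    )
    start_char = example["answer_start"]
    end_char = start_char + len(example["answer_text"])

    # Fall back to the [CLS] position if truncation dropped the answer.
    start_tok = end_tok = 0
    for idx, (seq_id, (off_s, off_e)) in enumerate(
            zip(enc.sequence_ids(), enc["offset_mapping"])):
        if seq_id != 1:             # only look inside the context segment
            continue
        if off_s <= start_char < off_e:
            start_tok = idx
        if off_s < end_char <= off_e:
            end_tok = idx
    enc["start_positions"] = start_tok
    enc["end_positions"] = end_tok
    enc.pop("offset_mapping")       # offsets are only needed for label mapping
    return enc

train_ds = raw.map(preprocess, remove_columns=raw.column_names)

# Placeholder hyperparameters; the study's actual settings are not shown here.
args = TrainingArguments(output_dir="bert-ul-qa",
                         num_train_epochs=3,
                         per_device_train_batch_size=8,
                         learning_rate=3e-5)
Trainer(model=model, args=args, train_dataset=train_ds,
        data_collator=default_data_collator).train()
```

Exact Match and F1 would then be computed on a held-out split, for example with the squad metric from the evaluate library; the epoch count, batch size, and learning rate above are conventional defaults rather than the values used in the study.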