Large language models (LLMs) show promise for aiding graduate level education, but are limited by their training data and potential confabulations. We developed ChemTAsk, an open-source pipeline that combines LLMs with retrieval-augmented generation (RAG) to provide accurate, context-specific assistance. ChemTAsk utilizes course materials, including lecture transcripts and primary publications, to generate accurate responses to student queries. Over nine weeks in an advanced biological chemistry course at the University of Pennsylvania, students could opt in to use ChemTAsk for assistance in any assignment or to understand class material. Comparative analysis showed ChemTAsk performed on par with human teaching assistants (TAs) in understanding student queries and providing accurate information, particularly excelling in creative problem-solving tasks. In contrast, TAs were more precise in their responses and tailored their assistance to the specifics of the class. Student feedback indicated that ChemTAsk was perceived as correct, helpful, and faster than TAs. Open-source and proprietary models from Meta and OpenAI respectively were tested on an original biological chemistry benchmark for future iterations of ChemTAsk. It was found that OpenAI models were more tolerant to deviations in the input prompt and excelled in self-assessment to safeguard for potential confabulations. Taken together, ChemTAsk demonstrates the potential of integrating LLMs with RAG to enhance educational support, offering a scalable tool for students and educators.
翻译:暂无翻译