Recent general-domain large language models (LLMs), such as ChatGPT, have shown remarkable success in following instructions and producing human-like responses. However, such models have not yet been adapted to the medical domain, which results in inaccurate responses and an inability to give sound advice on medical diagnoses, medications, and related matters. To address this problem, we fine-tuned our ChatDoctor model on 100k real-world patient-physician conversations from an online medical consultation site. In addition, we gave ChatDoctor autonomous knowledge retrieval capabilities, using external sources such as Wikipedia or a disease database as a knowledge brain. After fine-tuning on these 100k patient-physician conversations, our model showed significant improvements in understanding patients' needs and providing informed advice. The autonomous ChatDoctor model, backed by the Wikipedia and database brains, can access real-time, authoritative information and answer patient questions based on it, which significantly improves the accuracy of the model's responses and shows extraordinary potential for the medical field, where tolerance for error is low. To facilitate further development of dialogue models in the medical field, we make all source code, datasets, and model weights available at: https://github.com/Kent0n-Li/ChatDoctor.