Mental illness represents a substantial global socioeconomic burden, with COVID-19 further exacerbating accessibility challenges and driving increased demand for telehealth mental health support. While large language models (LLMs) offer promising solutions through 24/7 availability and non-judgmental interactions, pre-trained models often lack the contextual and emotional awareness necessary for appropriate therapeutic responses. This paper investigated the application of supervised fine-tuning (SFT) and reinforcement learning (RL) techniques to enhance GPT-2's capacity for therapeutic dialogue generation. The methodology restructured the input format to enable simultaneous processing of contextual information and annotated emotional states alongside user input, and employed a multi-component reward function that aligned model outputs with professional therapist responses and annotated emotions. Results demonstrated improvements from reinforcement learning over baseline GPT-2 across multiple evaluation metrics: BLEU (0.0111), ROUGE-1 (0.1397), ROUGE-2 (0.0213), ROUGE-L (0.1317), and METEOR (0.0581). LLM-based evaluation confirmed high contextual relevance and professionalism, while the reinforcement-learned model achieved 99.34% emotion accuracy compared to 66.96% for baseline GPT-2. These findings demonstrate reinforcement learning's effectiveness in developing therapeutic dialogue systems that can serve as valuable assistive tools for therapists while maintaining essential human clinical oversight.
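To make the described setup concrete, the following is a minimal Python sketch of the two ideas named in the abstract: a restructured input that conditions generation on context and emotion alongside the user turn, and a multi-component reward combining similarity to the professional therapist reply with emotion agreement. All function names, the prompt layout, the token-overlap similarity, and the weights are illustrative assumptions, not the paper's exact implementation.

# Hypothetical sketch; names, prompt format, and reward weights are assumptions.

def build_prompt(context: str, emotion: str, user_utterance: str) -> str:
    # Restructured input: dialogue context and annotated emotion are placed
    # alongside the user's turn in a single conditioning string for GPT-2.
    return f"Context: {context}\nEmotion: {emotion}\nUser: {user_utterance}\nTherapist:"

def reward(generated: str, therapist_reference: str,
           predicted_emotion: str, gold_emotion: str,
           w_sim: float = 0.7, w_emo: float = 0.3) -> float:
    # Multi-component reward: (1) lexical overlap with the professional
    # therapist response, (2) agreement with the annotated emotion label.
    gen_tokens = set(generated.lower().split())
    ref_tokens = set(therapist_reference.lower().split())
    overlap = len(gen_tokens & ref_tokens) / max(len(ref_tokens), 1)
    emotion_match = 1.0 if predicted_emotion == gold_emotion else 0.0
    return w_sim * overlap + w_emo * emotion_match

In an RL fine-tuning loop, a reward of this shape would score each sampled response against the therapist reference and emotion annotation before the policy update; the specific similarity measure and weighting used in the paper may differ.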