Automated mental health analysis shows great potential for enhancing the efficiency and accessibility of mental health care, with recent methods using pre-trained language models (PLMs) and incorporated emotional information. The latest large language models (LLMs), such as ChatGPT, exhibit dramatic capabilities on diverse natural language processing tasks. However, existing studies on ChatGPT for mental health analysis bear limitations in inadequate evaluations, ignorance of emotional information, and lack of explainability. To bridge these gaps, we comprehensively evaluate the mental health analysis and emotional reasoning ability of ChatGPT on 11 datasets across 5 tasks, and analyze the effects of various emotion-based prompting strategies. Based on these prompts, we further explore LLMs for interpretable mental health analysis by instructing them to also generate explanations for each of their decisions. With an annotation protocol designed by domain experts, we convey human evaluations to assess the quality of explanations generated by ChatGPT and GPT-3. The annotated corpus will be released for future research. Experimental results show that ChatGPT outperforms traditional neural network-based methods but still has a significant gap with advanced task-specific methods. Prompt engineering with emotional cues can be effective in improving performance on mental health analysis but suffers from a lack of robustness and inaccurate reasoning. In addition, ChatGPT significantly outperforms GPT-3 on all criteria in human evaluations of the explanations and approaches to human performance, showing its great potential in explainable mental health analysis.
翻译:暂无翻译