ChatGPT is a chatbot service recently released by OpenAI that has been receiving increasing attention over the past few months. While many aspects of ChatGPT have been evaluated, its robustness, i.e., its performance when facing unexpected inputs, is still unclear to the public. Robustness is of particular concern in responsible AI, especially for safety-critical applications. In this paper, we conduct a thorough evaluation of the robustness of ChatGPT from the adversarial and out-of-distribution (OOD) perspective. To do so, we employ the AdvGLUE and ANLI benchmarks to assess adversarial robustness, and the Flipkart review and DDXPlus medical diagnosis datasets for OOD evaluation. We select several popular foundation models as baselines. The results show that ChatGPT has no consistent advantage on adversarial and OOD classification tasks, although it performs well on translation tasks. This suggests that the lack of adversarial and OOD robustness remains a significant threat to foundation models. Moreover, ChatGPT shows astounding performance in understanding dialogue-related texts, and we find that it tends to provide informal suggestions for medical tasks instead of definitive answers. Finally, we present in-depth discussions of possible research directions.
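As a rough illustration of the zero-shot evaluation protocol summarized above, the sketch below (not the authors' evaluation code) shows how one might score a chat model on the ANLI benchmark loaded from Hugging Face datasets. The prompt template and the `query_model` stub are hypothetical placeholders to be replaced with the chat service or baseline foundation model under test.

```python
# Minimal sketch of zero-shot NLI evaluation on ANLI (assumed setup, not the paper's code).
from datasets import load_dataset

LABELS = ["entailment", "neutral", "contradiction"]  # ANLI label ids 0/1/2


def build_prompt(premise: str, hypothesis: str) -> str:
    # Hypothetical prompt template for zero-shot classification.
    return (
        "Given the premise, is the hypothesis an entailment, neutral, or a "
        "contradiction? Answer with one word.\n"
        f"Premise: {premise}\nHypothesis: {hypothesis}"
    )


def query_model(prompt: str) -> str:
    """Hypothetical stub: call ChatGPT or a baseline foundation model here."""
    raise NotImplementedError


def evaluate(split: str = "test_r1", limit: int = 100) -> float:
    # Score the model on a small slice of one ANLI round.
    data = load_dataset("anli", split=split).select(range(limit))
    correct = 0
    for ex in data:
        answer = query_model(build_prompt(ex["premise"], ex["hypothesis"])).lower()
        # Map the free-form answer back to a label id; -1 if no label word is found.
        predicted = next((i for i, name in enumerate(LABELS) if name in answer), -1)
        correct += int(predicted == ex["label"])
    return correct / len(data)
```

The same loop structure applies to the other benchmarks (AdvGLUE, Flipkart reviews, DDXPlus), with the prompt template and label set adjusted to each task.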