Large Language Models (LLMs) demonstrate remarkable capabilities in text generation, yet their emotional consistency and semantic coherence in social media contexts remain insufficiently understood. This study investigates how LLMs handle emotional content and maintain semantic relationships in continuation and response tasks, using three open-source models (Gemma, Llama3, and Llama3.3) and one commercial model (Claude). By analyzing climate change discussions from Twitter and Reddit, we examine emotional transitions, intensity patterns, and semantic consistency between human-authored and LLM-generated content. Our findings reveal that while all four models maintain high semantic coherence, they exhibit a distinct emotional pattern: a strong tendency to moderate negative emotions. When the input text carries negative emotions such as anger, disgust, fear, or sadness, the LLMs tend to generate more emotionally neutral content, or even shift toward positive emotions such as joy or surprise. Compared with human-authored content, the four models systematically generate responses with reduced emotional intensity and show a preference for neutral, rational emotions in the response task. In addition, all models maintain high semantic similarity with the original text, although their performance differs between the continuation task and the response task. These findings provide insight into the emotional and semantic processing capabilities of LLMs, with implications for their deployment in social media environments and for human-computer interaction design.