Eating disorders (ED), severe mental health conditions with high rates of mortality and morbidity, affect millions of people globally, especially adolescents. The proliferation of online communities that promote and normalize ED has been linked to this public health crisis. However, identifying harmful communities is challenging due to their use of coded language and other obfuscation strategies. To address this challenge, we propose a novel framework that surfaces the implicit attitudes of online communities by adapting large language models (LLMs) to the language of each community. We describe an alignment method and evaluate results along multiple dimensions of semantics and affect. We then use the community-aligned LLM to respond to psychometric questionnaires designed to identify ED in individuals. We demonstrate that LLMs can effectively adopt community-specific perspectives and reveal significant variation in eating disorder risk across online communities. These findings highlight the utility of LLMs for revealing the implicit attitudes and collective mindsets of communities, offering new tools for mitigating harmful content on social media.
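The pipeline summarized above can be sketched in two stages: adapt a base LLM to a community's language via causal language modeling, then administer psychometric items to the adapted model and compare the likelihoods it assigns to the answer options. The sketch below is illustrative only, not the paper's implementation; the base model (`gpt2`), the file `community_posts.txt`, the sample item, and all hyperparameters are assumptions introduced for this example.

```python
# A minimal sketch, assuming Hugging Face Transformers/Datasets: fine-tune a
# causal LM on community posts, then score a questionnaire item by comparing
# the log-likelihood of each answer option. All names and settings here are
# illustrative placeholders, not the paper's actual data or configuration.
import torch
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

MODEL_NAME = "gpt2"  # placeholder base model
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

# Stage 1: align the LM with the community by fine-tuning on its posts.
posts = load_dataset("text", data_files={"train": "community_posts.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=256)

train_set = posts["train"].map(tokenize, batched=True, remove_columns=["text"])
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="community_lm", num_train_epochs=1,
                           per_device_train_batch_size=4),
    train_dataset=train_set,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()

# Stage 2: administer a psychometric item and compare answer-option likelihoods.
item = "I am terrified of gaining weight. Answer (never/always):"
options = [" never", " always"]

def option_logprob(prompt, option):
    """Sum of log-probabilities the model assigns to the option tokens."""
    full_ids = tokenizer(prompt + option,
                         return_tensors="pt").input_ids.to(model.device)
    prompt_len = tokenizer(prompt, return_tensors="pt").input_ids.shape[1]
    with torch.no_grad():
        logits = model(full_ids).logits
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    targets = full_ids[0, 1:]
    rows = torch.arange(prompt_len - 1, full_ids.shape[1] - 1)
    return log_probs[rows, targets[rows]].sum().item()

print({opt.strip(): option_logprob(item, opt) for opt in options})
```

Comparing option log-likelihoods, rather than free-form generation, keeps the response scoring deterministic and lets the same items be administered identically to models aligned with different communities.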