The transgender community experiences a large disparity in mental health conditions compared with the general population. Interpreting the social media data posted by transgender people may help us better understand the sentiments of these sexual minority groups and apply early interventions. In this study, we manually categorize 300 social media comments posted by transgender people into negative, positive, and neutral sentiment classes. Five machine learning algorithms and two deep neural networks are used to build sentiment analysis classifiers on the annotated data. Results show that our annotations are reliable, with a Cohen's kappa above 0.8 for all three classes. The LSTM model performs best, with an accuracy over 0.85 and an AUC of 0.876. Our next step will focus on applying more advanced natural language processing algorithms to a larger annotated dataset.
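As an illustration of the reliability measure mentioned above, the following is a minimal sketch of how Cohen's kappa between two annotators could be computed, both overall and per class. It is not the study's actual procedure: the one-vs-rest reading of "across all three classes", the function names `cohen_kappa` and `per_class_kappa`, and the toy labels are all assumptions made for this example.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's label frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        # Both annotators used one identical label for every item.
        return 1.0
    return (observed - expected) / (1 - expected)


def per_class_kappa(labels_a, labels_b, classes):
    """One-vs-rest kappa per class (an assumed reading of 'per class')."""
    return {
        c: cohen_kappa([x == c for x in labels_a], [y == c for y in labels_b])
        for c in classes
    }


# Hypothetical toy annotations, not data from the study.
annotator_1 = ["neg", "neg", "pos", "neu", "pos", "neg", "neu", "pos"]
annotator_2 = ["neg", "neg", "pos", "neu", "neg", "neg", "neu", "pos"]

overall = cohen_kappa(annotator_1, annotator_2)
by_class = per_class_kappa(annotator_1, annotator_2, ["neg", "pos", "neu"])
print(overall, by_class)
```

Kappa corrects raw agreement for the agreement expected by chance, so it is a stricter check than simple percent agreement when class frequencies are imbalanced.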