Linguistics has been instrumental in developing a deeper understanding of human nature. Words are indispensable for conveying the thoughts, emotions, and purpose of any human interaction, and critically analyzing them can elucidate the social and psychological behavior and characteristics of humans as social animals. Social media has become a platform for human interaction on a large scale and thus offers scope for collecting and using its data in such studies. However, iteratively collecting, labeling, and analyzing this data is cumbersome. To make the process easier and more structured, we introduce TLA (Twitter Linguistic Analysis). In this paper, we describe TLA, provide a basic understanding of the framework, and discuss the process of collecting, labeling, and analyzing data from Twitter for a corpus of languages. We also provide detailed labeled datasets for all the languages, on which the models are trained. The analysis provided by TLA can also help in understanding the sentiments of different linguistic communities and in devising new solutions to their problems based on that analysis.