As a matter of cognition, user interests should be classifiable independently of the users' language, the social network, and the content of interest itself. To demonstrate this, we analyze a collection of English and Russian Twitter and Vkontakte community pages according to the interests of their followers. First, we build a model of Major Interests (MaIs) with the help of expert analysis, and then classify a set of pages using machine learning algorithms (SVM, Neural Network, Naive Bayes, and others). We take three interest domains that are typical of both English- and Russian-speaking communities: football, rock music, and vegetarianism. The classification results show a greater correlation between Russian-Vkontakte and Russian-Twitter pages, while English-Twitter pages appear to yield the highest score.