This paper demonstrates a two-stage method for deriving insights from social media data relating to disinformation by applying a combination of geospatial classification and embedding-based language modelling across multiple languages. In particular, the analysis is centered on Twitter and disinformation for three European languages: English, French and Spanish. Firstly, Twitter data is classified into European and non-European sets using BERT. Secondly, Word2vec is applied to the classified texts, resulting in Eurocentric, non-Eurocentric and global representations of the data for the three target languages. This comparative analysis not only demonstrates the efficacy of the classification method but also highlights geographic, temporal and linguistic differences in the disinformation-related media. Thus, the contributions of the work are threefold: (i) a novel language-independent transformer-based geolocation method; (ii) an analytical approach that exploits lexical specificity and word embeddings to interrogate user-generated content; and (iii) a dataset of 36 million disinformation-related tweets in English, French and Spanish.