Social media plays a significant role in disaster management by providing valuable data about affected people, donations, and help requests. Recent studies highlight the need to filter information on social media into fine-grained content labels. However, identifying useful information among the massive number of social media posts published during a crisis is a challenging task. In this paper, we propose I-AID, a multi-model approach that automatically categorizes tweets into multi-label information types and filters critical information from the enormous volume of social media data. I-AID incorporates three main components: i) a BERT-based encoder that captures the semantics of a tweet and represents it as a low-dimensional vector, ii) a graph attention network (GAT) that captures correlations between the words/entities in tweets and the corresponding information types, and iii) a Relation Network that serves as a learnable distance metric to compute the similarity between tweets and their corresponding information types in a supervised way. We conducted several experiments on two real-world, publicly available datasets. Our results indicate that I-AID outperforms state-of-the-art approaches in terms of weighted average F1 score by 6% and 4% on the TREC-IS dataset and COVID-19 Tweets, respectively.
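For readers unfamiliar with the reported metric, the following is a minimal sketch of a weighted average F1 score for multi-label predictions. It assumes the common support-weighted convention (per-label F1 averaged with weights equal to each label's number of gold occurrences); the exact evaluation script used for I-AID is not given here, and the function name `weighted_f1` and the label names in the example are hypothetical.

```python
from typing import List, Set


def weighted_f1(gold: List[Set[str]], pred: List[Set[str]]) -> float:
    """Support-weighted mean of per-label F1 for multi-label predictions.

    Assumption: weights are each label's support (count of gold occurrences).
    """
    labels = sorted(set().union(*gold))
    total_support = sum(len(g) for g in gold)
    if total_support == 0:
        return 0.0

    score = 0.0
    for label in labels:
        pairs = list(zip(gold, pred))
        tp = sum(1 for g, p in pairs if label in g and label in p)
        fp = sum(1 for g, p in pairs if label not in g and label in p)
        fn = sum(1 for g, p in pairs if label in g and label not in p)
        support = tp + fn  # number of tweets that truly carry this label
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += support * f1
    return score / total_support


# Hypothetical information types for three tweets (illustration only).
gold = [{"Request", "Donation"}, {"Report"}, {"Request"}]
pred = [{"Request"}, {"Report", "Donation"}, {"Request"}]
print(weighted_f1(gold, pred))  # 0.75
```

In this toy example, "Request" (support 2) and "Report" (support 1) are predicted perfectly, while "Donation" (support 1) is missed once and wrongly predicted once, so the score is (2·1 + 1·1 + 1·0) / 4 = 0.75.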