Identifying fine-grained location mentions in crisis tweets is central to transforming situational awareness information extracted from social media into actionable information. Most prior work has focused on identifying generic locations, without considering their specific types. To facilitate progress on the fine-grained location identification task, we assemble two crisis tweet datasets and manually annotate them with specific location types. The first dataset contains tweets from a mixed set of crisis events, while the second contains tweets from the global COVID-19 pandemic. We investigate the performance of state-of-the-art deep learning models for sequence tagging on these datasets, in both in-domain and cross-domain settings.