Point-of-interest (POI) type prediction is the task of inferring the type of place from which a social media post was shared. Inferring a POI's type is useful for studies in computational social science, including sociolinguistics, geosemiotics, and cultural geography, and has applications in geosocial networking technologies such as recommendation and visualization systems. Prior efforts in POI type prediction focus solely on text, without taking visual information into account. However, in reality, the variety of modalities, as well as their semiotic relationships with one another, shape communication and interaction in social media. This paper presents a study on POI type prediction using multimodal information from text and images available at posting time. For that purpose, we enrich a currently available data set for POI type prediction with the images that accompany the text messages. Our proposed method extracts relevant information from each modality to effectively capture interactions between text and image, achieving a macro F1 of 47.21 across eight categories and significantly outperforming the state-of-the-art text-only method for POI type prediction. Finally, we provide a detailed analysis to shed light on cross-modal interactions and the limitations of our best-performing model.