In this paper, we describe our efforts to establish a simple knowledge base by building a semantic network of concepts and word relationships in the context of disasters in the Philippines. Our primary data source is a collection of news articles scraped from various Philippine news websites. Using word embeddings, we extract semantically similar and co-occurring words starting from an initial list of seed words. We arrive at an expanded ontology with a total of 450 word assertions. Experts from the fields of linguistics, disasters, and weather science evaluated our knowledge base, yielding an agreeability rate of 64%. We then perform a time-based analysis of the assertions to identify important semantic changes captured by the knowledge base, such as (a) trends in the roles played by human entities, (b) memberships of human entities, and (c) common associations of disaster-related words. The context-specific knowledge base developed in this study can be adapted for use by intelligent agents, such as chatbots integrated into platforms like Facebook Messenger, to answer disaster-related queries.
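The seed-expansion step summarized in the abstract can be illustrated with a minimal sketch. It assumes cosine similarity over word vectors, a fixed number of neighbors per seed, and a similarity cutoff; none of these choices are stated in the abstract. The toy vectors, the `expand_seeds` helper, and its `top_k` and `threshold` parameters are hypothetical, and the sketch covers only the similarity part, not co-occurrence extraction.

```python
import math

# Toy 3-dimensional embeddings. The words and vector values are invented
# for illustration and are NOT taken from the paper's news corpus.
EMBEDDINGS = {
    "typhoon": [0.9, 0.1, 0.0],
    "storm": [0.8, 0.2, 0.1],
    "flood": [0.7, 0.3, 0.0],
    "rescuer": [0.1, 0.9, 0.2],
    "volunteer": [0.2, 0.8, 0.3],
    "basketball": [0.0, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def expand_seeds(seeds, embeddings, top_k=2, threshold=0.9):
    """Return (seed, neighbor, similarity) triples for each seed word.

    top_k and threshold are assumed hyperparameters, not values from the paper.
    """
    assertions = []
    for seed in seeds:
        # Score every other vocabulary word against the seed.
        scored = [
            (other, cosine(embeddings[seed], vec))
            for other, vec in embeddings.items()
            if other != seed
        ]
        # Keep the top_k most similar words that also clear the cutoff.
        scored.sort(key=lambda pair: pair[1], reverse=True)
        for other, score in scored[:top_k]:
            if score >= threshold:
                assertions.append((seed, other, round(score, 3)))
    return assertions
```

With these toy vectors, `expand_seeds(["typhoon", "rescuer"], EMBEDDINGS)` pairs "typhoon" with "storm" and "flood", and "rescuer" with "volunteer". In practice the embeddings would be trained on the scraped news articles rather than written by hand.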