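Taxonomy is the practice and science of classification. Wikipedia categories illustrate a taxonomy, and a full taxonomy of Wikipedia categories can be extracted by automatic means. As of 2009, it had been shown that a manually constructed taxonomy, such as that of a computational lexicon like WordNet, can be used to improve and restructure the Wikipedia category taxonomy.

In a broader sense, taxonomy also applies to relationship schemes other than parent-child hierarchies, such as network structures. A taxonomy may then include a single child with multiple parents: for example, "car" might appear under both parents "vehicle" and "steel structure". To some, however, this merely means that "car" is part of several different taxonomies. A taxonomy might also simply organize things into groups, or be an alphabetical list; in that case, however, the term "vocabulary" is more appropriate. In current usage within knowledge management, taxonomies are considered narrower than ontologies, because ontologies apply a wider variety of relation types.

Mathematically, a hierarchical taxonomy is a tree structure of classifications for a given set of objects. At the top of this structure is a single classification that applies to all objects: the root node. Nodes below the root are more specific classifications that apply to subsets of the total set of classified objects. Reasoning proceeds from the general to the more specific.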
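The difference between a strict tree-shaped taxonomy and the broader multi-parent case can be sketched in a few lines of Python. This is only an illustration: the `Taxonomy` class, its method names, and the extra root node "thing" are assumptions introduced here, not part of any library or of the sources listed on this page.

```python
class Taxonomy:
    """Toy taxonomy stored as a mapping: node -> set of parent nodes.

    Hypothetical helper for illustration only.
    """

    def __init__(self):
        self.parents = {}

    def add(self, child, parent):
        # Register both nodes; the parent gets an empty parent set if new.
        self.parents.setdefault(child, set()).add(parent)
        self.parents.setdefault(parent, set())

    def ancestors(self, node):
        # Walk upward from the node, collecting every more general class.
        seen = set()
        stack = list(self.parents.get(node, ()))
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.parents.get(current, ()))
        return seen

    def roots(self):
        # A root is a node with no parent: the most general classification.
        return {n for n, ps in self.parents.items() if not ps}

    def is_strict_hierarchy(self):
        # Tree shape: one root and at most one parent per node.
        # Assumes the input is acyclic; cycles are not checked here.
        single_parent = all(len(ps) <= 1 for ps in self.parents.values())
        return len(self.roots()) == 1 and single_parent


tax = Taxonomy()
tax.add("vehicle", "thing")          # "thing" is an assumed root node
tax.add("steel structure", "thing")
tax.add("car", "vehicle")
tax.add("car", "steel structure")    # second parent: no longer a tree

print(sorted(tax.ancestors("car")))
print(tax.is_strict_hierarchy())
```

Here "car" inherits from both "vehicle" and "steel structure", so the structure is a directed acyclic graph rather than a tree. Dropping the last `add` call restores the strict hierarchy described in the mathematical definition.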

Knowledge Collection

Taxonomy: Zhuanzhi Collection

Getting Started

Entity Extraction

Relation Prediction

Datasets / Shared Tasks

  • SemEval-2015 Task 17: Taxonomy Extraction Evaluation (TExEval-1), Home, Report
  • SemEval-2016 Task 13: Taxonomy Extraction Evaluation (TExEval-2), Home, Report
  • SemEval-2016 Task 14: Semantic Taxonomy Enrichment, Home, Report
  • SemEval-2018 Task 9: Hypernym Discovery, Home, Report
  • UnsupervisedHypernymy, Home, EACL 2017 paper, including 4 datasets:
    • (Hypernymy Detection): EVAL, BLESS, LEDS (a.k.a Lenci/Benotto), Weeds
  • HypernymySuite, Home, ACL 2018 paper, including (somewhat modified) datasets:
    • (Hypernymy Detection): BLESS, LEDS, EVAL, SHWARTZ, WBLESS
    • (Hypernymy Direction): BLESS, WBLESS, BIBLESS
    • (Graded Entailment): HyperLex
  • NCBI Taxonomy Harvest
  • GBIF Backbone Taxonomy

Tutorial

  1. KDD 2019 tutorial Constructing and Mining Heterogeneous Information Networks from Massive Text
  2. VLDB 2019 Tutorial TextCube: Automated Construction and Multidimensional Exploration

Papers

Surveys & Reports

Hypernymy Discovery & Lexical Entailment

Instance-Based Taxonomy Construction

Clustering-Based Taxonomy Construction

Taxonomy Expansion

Taxonomy Applications

Joint Taxonomy Construction and Application

Domain Experts

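1. Jiawei Han (韩家炜) is a professor in the Department of Computer Science at the University of Illinois at Urbana-Champaign, a Fellow of both IEEE and ACM, and director of the US Information Network Academic Research Center. He has served as program committee chair of well-known international conferences including KDD, SDM, and ICDM, and founded the journal ACM TKDD, serving as its editor-in-chief. He has published more than 600 papers in data mining, databases, and information networks. Jiawei Han's homepage, Jiawei Han on dblp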

VIP Content

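Title: Octet: Online Catalog Taxonomy Enrichment with Self-Supervision

Abstract:

Taxonomies are widely used across domains, especially online for item categorization, browsing, and search. Although online catalog taxonomies are in prevalent use, most of them are in practice maintained by humans, which is labor-intensive and difficult to scale. Taxonomy construction from scratch has been studied extensively in the literature, but how to effectively enrich an existing, incomplete taxonomy remains an open and important research problem. Taxonomy enrichment requires not only robustness to newly emerging terms, but also consistency between the existing taxonomy structure and the attachment of new terms. In this paper, we present Octet, a self-supervised end-to-end framework for online catalog taxonomy enrichment. Octet leverages heterogeneous information unique to online catalog taxonomies, such as user queries, items, and their relations to taxonomy nodes, while requiring no supervision other than the existing taxonomy. We propose a sequence labeling model for term extraction and employ graph neural networks (GNNs) to capture the taxonomy structure and the query-item-taxonomy interactions for term attachment. Extensive experiments in different online domains show that Octet outperforms state-of-the-art methods under both automatic and human evaluation. Notably, Octet enriches an online catalog taxonomy in production to twice its original size in an open-world evaluation.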


Popular Content

The questions "can machines think" and "can machines do what humans do" drive the development of artificial intelligence. Although recent artificial intelligence succeeds in many data-intensive applications, it still lacks the ability to learn from limited exemplars and to generalize quickly to new tasks. To tackle this problem, one has to turn to machine learning, which supports the scientific study of artificial intelligence. In particular, a machine learning problem called Few-Shot Learning (FSL) targets this case. It can rapidly generalize to new tasks with limited supervised experience by turning to prior knowledge, which mimics the human ability to acquire knowledge from few examples through generalization and analogy. It has been seen as a test-bed for real artificial intelligence, a way to reduce laborious data gathering and computationally costly training, and an antidote for rare-case learning. With extensive work on FSL emerging, we give a comprehensive survey of it. We first give the formal definition of FSL. Then we point out the core issues of FSL, which turns the problem from "how to solve FSL" into "how to deal with the core issues". Accordingly, existing works, from the birth of FSL to the most recently published ones, are categorized in a unified taxonomy, with thorough discussion of the pros and cons of the different categories. Finally, we envision possible future directions for FSL in terms of problem setup, techniques, applications, and theory, hoping to provide insights to both beginners and experienced researchers.


Latest Content

Concept drift in process mining (PM) is a challenge because classical methods assume that processes are in a steady state, i.e., that events share the same process version. We conducted a systematic literature review on the intersection of these areas; we review concept drift in process mining and bring forward a taxonomy of existing techniques for drift detection and online process mining in evolving environments. Existing work shows that (i) PM still primarily focuses on offline analysis, and (ii) the assessment of concept drift techniques in processes is cumbersome due to the lack of a common evaluation protocol, datasets, and metrics.
