Semantic typing aims to classify tokens or spans of interest in a textual context into semantic categories such as relations, entity types, and event types. The inferred labels offer a meaningful interpretation of how machines understand components of text. In this paper, we present UniST, a unified framework for semantic typing that captures label semantics by projecting both inputs and labels into a joint semantic embedding space. To formulate different lexical and relational semantic typing tasks as a single unified task, we jointly encode task descriptions with the input, allowing UniST to be adapted to different tasks without introducing task-specific model components. UniST optimizes a margin ranking loss such that the semantic relatedness of the input and labels is reflected in their embedding similarity. Our experiments demonstrate that UniST achieves strong performance across three semantic typing tasks: entity typing, relation classification, and event typing. UniST also effectively transfers semantic knowledge of labels and substantially improves generalizability when inferring rarely seen and unseen types. In addition, multiple semantic typing tasks can be jointly trained within the unified framework, leading to a single compact multi-task model that performs comparably to dedicated single-task models while offering even better transferability.
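To make the ranking objective concrete, the following is a minimal sketch of a margin ranking loss over embedding similarities. It is an illustration under stated assumptions, not the formulation used in the paper: the abstract does not specify the similarity function, the margin value, or how negatives are chosen, so cosine similarity, `margin=0.1`, summation over all positive/negative pairs, and the helper names `cosine`, `margin_ranking_loss`, and `rank_labels` are all hypothetical. The toy vectors stand in for encoder outputs.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def margin_ranking_loss(input_vec, positive_labels, negative_labels, margin=0.1):
    """Sum of hinge penalties over every (positive, negative) label pair.

    A pair contributes zero once the positive label is more similar to the
    input than the negative label by at least `margin` (an assumed value).
    """
    loss = 0.0
    for pos in positive_labels:
        s_pos = cosine(input_vec, pos)
        for neg in negative_labels:
            s_neg = cosine(input_vec, neg)
            loss += max(0.0, margin - s_pos + s_neg)
    return loss


def rank_labels(input_vec, label_vecs):
    """Return label names sorted from most to least similar to the input."""
    return sorted(
        label_vecs,
        key=lambda name: cosine(input_vec, label_vecs[name]),
        reverse=True,
    )


# Toy embeddings (hypothetical): the input lies close to the "person" label.
input_vec = [1.0, 0.0]
labels = {"person": [1.0, 0.1], "location": [0.0, 1.0]}

well_ranked = margin_ranking_loss(input_vec, [labels["person"]], [labels["location"]])
mis_ranked = margin_ranking_loss(input_vec, [labels["location"]], [labels["person"]])
ranking = rank_labels(input_vec, labels)
```

Because labels are scored by similarity in a shared space rather than by a fixed output layer, a label unseen during training can in principle be ranked the same way once its embedding is available, which is the intuition behind the transfer claims above.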