Contextualized, or discourse-aware, commonsense inference is the task of generating coherent commonsense assertions (i.e., facts) from a given story and a particular sentence in that story. The task faces several problems: a lack of controllability over the topics of the inferred facts; a lack of commonsense knowledge during training; and, possibly, hallucinated or false facts. In this work, we apply a transformer model to this task and develop techniques to address these problems. We control the inference by introducing a new technique we call "hinting". Hinting is a form of language-model prompting that utilizes both hard prompts (specific words) and soft prompts (virtual, learnable templates); it serves as a control signal that advises the language model "what to talk about". Next, we establish a methodology for performing joint inference with multiple commonsense knowledge bases. Joint inference over commonsense knowledge requires care, because such knowledge is imprecise and its level of generality is flexible; the results must "still make sense" for the context. To this end, we align the textual versions of assertions from three knowledge graphs (ConceptNet, ATOMIC2020, and GLUCOSE) with a story and a target sentence. This combination allows us to train a single model to perform joint inference with multiple knowledge graphs, and we report experimental results on joint inference for all three. Our final contribution is exploring a GAN architecture that generates contextualized commonsense assertions and scores their plausibility through a discriminator. The result is an integrated system for contextual commonsense inference in stories that can controllably generate plausible commonsense assertions and takes advantage of joint inference across multiple commonsense knowledge bases.
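The "hinting" control signal described above can be illustrated with a minimal sketch: soft-prompt embeddings (virtual learnable templates) and the embeddings of a hard prompt (specific words, e.g. a relation name) are prepended to the story's input embeddings. All names here (`embed`, `build_hinted_input`, the toy vocabulary) are illustrative assumptions, not the paper's actual implementation; in practice the soft prompt would be trained by gradient descent inside a transformer.

```python
import numpy as np

EMBED_DIM = 8
rng = np.random.default_rng(0)

# Toy vocabulary; "xIntent" stands in for a hard-prompt relation word.
vocab = {"xIntent": 0, "go": 1, "park": 2}
embedding_table = rng.normal(size=(len(vocab), EMBED_DIM))

# Soft prompt: virtual learnable template vectors (here just initialized;
# during training these parameters would receive gradients).
NUM_SOFT_TOKENS = 3
soft_prompt = rng.normal(size=(NUM_SOFT_TOKENS, EMBED_DIM))

def embed(tokens):
    """Look up embeddings for a list of tokens."""
    return embedding_table[[vocab[t] for t in tokens]]

def build_hinted_input(story_tokens, hard_hint_tokens):
    """Prepend [soft prompt; hard hint] to the story embeddings,
    forming the control signal plus context fed to the language model."""
    return np.concatenate(
        [soft_prompt, embed(hard_hint_tokens), embed(story_tokens)], axis=0
    )

x = build_hinted_input(["go", "park"], ["xIntent"])
print(x.shape)  # 3 soft tokens + 1 hard-hint token + 2 story tokens
```

The hard prompt steers *what* to talk about (a topic or relation), while the soft prompt gives the model trainable capacity to learn *how* to use that steering signal.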