Slang is a common type of informal language, but its flexible nature and paucity of data resources present challenges for existing natural language systems. We take an initial step toward machine generation of slang by developing a framework that models the speaker's word choice in a slang context. Our framework encodes novel slang meaning by relating the conventional and slang senses of a word while incorporating syntactic and contextual knowledge in slang usage. We construct the framework using a combination of probabilistic inference and neural contrastive learning. We perform rigorous evaluations on three slang dictionaries and show that our approach not only outperforms state-of-the-art language models, but also better predicts the historical emergence of slang word usages from the 1960s to the 2000s. We interpret the proposed models and find that the contrastively learned semantic space is sensitive to the similarities between slang and conventional senses of words. Our work creates opportunities for the automated generation and interpretation of informal language.