Humor is an important social phenomenon, serving complex social and psychological functions. However, despite having been studied for millennia, humor is not well understood computationally and is often considered an AI-complete problem. In this work, we introduce a novel setting in humor mining: automatically detecting funny and unusual scientific papers. We are inspired by the Ig Nobel prize, a satirical prize awarded annually to celebrate funny scientific achievements (example past winner: "Are cows more likely to lie down the longer they stand?"). This challenging task has unique characteristics that make it particularly suitable for automatic learning. We construct a dataset containing thousands of funny papers and use it to learn classifiers, combining findings from psychology and linguistics with recent advances in NLP. We use our models to identify potentially funny papers in a large dataset of over 630,000 articles. The results demonstrate the potential of our methods and, more broadly, the utility of integrating state-of-the-art NLP methods with insights from more traditional disciplines.