Most humour processing systems to date make at best discrete, coarse-grained distinctions between the comical and the conventional, yet such notions are better conceptualized as a broad spectrum. In this paper, we present a probabilistic approach, a variant of Gaussian process preference learning (GPPL), that learns to rank and rate the humorousness of short texts by exploiting human preference judgments and automatically sourced linguistic annotations. We apply our system, which is similar to one that had previously shown good performance on English-language one-liners annotated with pairwise humorousness annotations, to the Spanish-language data set of the HAHA@IberLEF2019 evaluation campaign. We report system performance for the campaign's two subtasks, humour detection and funniness score prediction, and discuss some issues arising from the conversion between the numeric scores used in the HAHA@IberLEF2019 data and the pairwise judgment annotations required for our method.
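To make the score-to-pairwise conversion mentioned above concrete, the following is a minimal sketch of one way numeric funniness scores could be turned into pairwise preference labels. It is an illustrative assumption, not the procedure used in the paper: the function name `scores_to_pairs`, the `min_gap` threshold, and the example scores are all hypothetical.

```python
from itertools import combinations


def scores_to_pairs(scores, min_gap=0.0):
    """Convert per-item numeric scores into (preferred, other) index pairs.

    Hypothetical helper: a pair is emitted only when the two scores differ
    by more than `min_gap`; ties and near-ties yield no preference label.
    """
    pairs = []
    for i, j in combinations(range(len(scores)), 2):
        diff = scores[i] - scores[j]
        if abs(diff) <= min_gap:
            continue  # no clear preference between items i and j
        # Put the higher-scored item first.
        pairs.append((i, j) if diff > 0 else (j, i))
    return pairs


# Made-up scores for four short texts (not taken from any data set).
example_scores = [3.2, 1.0, 3.2, 4.5]
example_pairs = scores_to_pairs(example_scores)
strict_pairs = scores_to_pairs(example_scores, min_gap=1.5)
```

Under this scheme, items with equal scores produce no pair at all, and the number of candidate pairs grows quadratically with the number of texts. Both are the kind of issue that can arise when moving from numeric ratings to pairwise judgments.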