This paper compares BERT-SQuAD and Ab3P on the Abbreviation Definition Identification (ADI) task. ADI takes a text as input and outputs short forms (abbreviations/acronyms) together with their long forms (expansions). BERT with reranking improves over BERT without reranking, but it still falls short of the rule-based Ab3P baseline. What is BERT missing? Reranking introduces two new features: charmatch and freq. The first feature captures character constraints in acronyms, and the second captures frequency constraints across documents.
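The abstract names the two reranking features but does not say how they are computed. The sketch below is one plausible reading, not the paper's implementation: `charmatch` is assumed to check that the short form's characters occur in order in the candidate long form (starting at its first character), and `freq` is assumed to count the documents in which a (short form, long form) pair has been seen. The function names `build_freq` and `rerank`, and the simple sort-by-features scoring, are hypothetical.

```python
from collections import Counter


def charmatch(short_form: str, long_form: str) -> bool:
    """Assumed character constraint: the alphanumeric characters of the short
    form appear in order in the long form, and the first one opens the long form."""
    sf = [c.lower() for c in short_form if c.isalnum()]
    lf = long_form.lower()
    if not sf or not lf or sf[0] != lf[0]:
        return False
    pos = 0
    for ch in sf:
        pos = lf.find(ch, pos)
        if pos < 0:
            return False
        pos += 1
    return True


def build_freq(pairs_per_document):
    """Assumed frequency feature: number of documents containing each
    (short form, lowercased long form) pair."""
    counts = Counter()
    for pairs in pairs_per_document:
        # Normalize, then deduplicate so each document counts a pair once.
        for pair in {(sf, lf.lower()) for sf, lf in pairs}:
            counts[pair] += 1
    return counts


def rerank(short_form, candidates, freq):
    """Hypothetical reranker: prefer candidates that satisfy charmatch,
    then those seen in more documents; ties keep the original order."""
    def key(candidate):
        return (charmatch(short_form, candidate),
                freq[(short_form, candidate.lower())])
    return sorted(candidates, key=key, reverse=True)


# Toy data, invented for illustration only.
documents = [
    [("ADI", "Abbreviation Definition Identification")],
    [("ADI", "abbreviation definition identification")],
]
freq = build_freq(documents)
candidates = [
    "Definition Identification",
    "the Abbreviation Definition Identification",
    "Abbreviation Definition Identification",
]
best = rerank("ADI", candidates, freq)[0]
```

In this toy setup the candidates stand in for the spans a BERT-SQuAD model might propose. A real reranker would likely combine these features with the model's own scores rather than sort on them alone.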