Encoder-only transformer models have been successfully applied to a variety of table understanding tasks, as in TAPAS (Herzig et al., 2020). A major limitation of these architectures is that they are constrained to classification-like tasks such as cell selection or entailment detection. We present TABT5, an encoder-decoder model that generates natural language text based on tables and textual inputs. TABT5 overcomes the encoder-only limitation by incorporating a decoder component and leverages the input structure through table-specific embeddings and pre-training. TABT5 achieves new state-of-the-art results in several domains, including spreadsheet formula prediction (a 15% increase in sequence accuracy), QA (a 2.5% increase in sequence accuracy), and data-to-text generation (a 2.5% increase in BLEU).
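A minimal sketch of the architectural idea, assuming a T5 backbone (suggested by the model's name but not stated in this abstract): the table is flattened into a token sequence, and learned row and column index embeddings are added to the token embeddings before the encoder. The class name `TableT5Sketch` and the `row_ids`/`col_ids` inputs are illustrative assumptions, not the authors' released API.

```python
import torch.nn as nn
from transformers import T5ForConditionalGeneration

class TableT5Sketch(nn.Module):
    """Illustrative table-aware T5: token + row + column embeddings."""
    def __init__(self, model_name="t5-small", max_rows=64, max_cols=32):
        super().__init__()
        self.t5 = T5ForConditionalGeneration.from_pretrained(model_name)
        d = self.t5.config.d_model
        # Learned embeddings for table structure; index 0 can be reserved
        # for tokens outside the table (e.g., the textual question).
        self.row_emb = nn.Embedding(max_rows, d)
        self.col_emb = nn.Embedding(max_cols, d)

    def forward(self, input_ids, row_ids, col_ids, attention_mask, labels=None):
        tok = self.t5.shared(input_ids)  # standard T5 token embeddings
        emb = tok + self.row_emb(row_ids) + self.col_emb(col_ids)
        # Passing inputs_embeds bypasses T5's own embedding lookup, so the
        # structural signal reaches the encoder; the decoder is unchanged.
        return self.t5(inputs_embeds=emb,
                       attention_mask=attention_mask,
                       labels=labels)
```

At inference time, text generation would proceed as with a plain T5 (e.g., via `generate` on the same embedded inputs); only the encoder-side input representation differs in this sketch.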