The robustness of Text-to-SQL parsers against adversarial perturbations plays a crucial role in delivering highly reliable applications. Previous studies along this line have focused primarily on perturbations on the natural language (NL) question side, neglecting the variability of tables. Motivated by this, we propose Adversarial Table Perturbation (ATP) as a new attacking paradigm to measure the robustness of Text-to-SQL models. Following this proposition, we curate ADVETA, the first robustness evaluation benchmark featuring natural and realistic ATPs. All tested state-of-the-art models experience dramatic performance drops on ADVETA, revealing their vulnerability in real-world practice. To defend against ATP, we build a systematic adversarial training example generation framework tailored for better contextualization of tabular data. Experiments show that our approach not only brings the best robustness improvement against table-side perturbations but also substantially strengthens models against NL-side perturbations. We release our benchmark and code at: https://github.com/microsoft/ContextualSP.