Recent studies show that, despite being effective on numerous tasks, text processing algorithms may be vulnerable to deliberate attacks. However, whether such weaknesses can directly lead to security threats remains under-explored. To bridge this gap, we conducted vulnerability tests on Text-to-SQL, a technique that builds natural language interfaces for databases. Empirically, we showed that the Text-to-SQL modules of two commercial black-box applications (Baidu-UNIT and the Codex-powered Ai2sql) can be manipulated into producing malicious code, potentially leading to data breaches and Denial-of-Service attacks. This is the first demonstration of the danger of NLP models being exploited as attack vectors in the wild. Moreover, experiments involving four open-source frameworks verified that simple backdoor attacks can achieve a 100% success rate on Text-to-SQL systems with almost no impact on prediction performance. By reporting these findings and suggesting practical defences, we call for immediate attention from the NLP community to the identification and remediation of software security issues.
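To make the threat model concrete, below is a minimal, hypothetical sketch of the kind of manipulation referred to above. The `text_to_sql` function is an invented stand-in for a black-box Text-to-SQL service, not the actual behaviour of Baidu-UNIT or Ai2sql, and the payload is illustrative rather than one of the inputs tested in this work; it only mimics the tendency of some Text-to-SQL systems to copy quoted spans from the question verbatim into the generated query.

```python
# Conceptual sketch, NOT the paper's actual payloads or any vendor's code.
# It assumes (hypothetically) a Text-to-SQL model that pastes quoted text
# from the natural-language question into the SQL output unescaped.

def text_to_sql(question: str) -> str:
    """Hypothetical stand-in for a black-box Text-to-SQL service.

    Real systems are learned models; this function only reproduces the
    copy behaviour that makes the attack possible: the quoted span of the
    question is spliced into the WHERE clause without sanitisation.
    """
    payload = question.split('"')[1]  # extract the quoted span
    return f"SELECT * FROM users WHERE name = '{payload}'"

# Benign use: the interface behaves as advertised.
print(text_to_sql('Find the user named "alice"'))
# -> SELECT * FROM users WHERE name = 'alice'

# Malicious use: the quoted span smuggles a second statement into the
# output. If the downstream database executes the generated string as-is,
# the injected statement destroys data (and an analogous payload with an
# expensive self-join could cause Denial of Service).
print(text_to_sql('Find the user named "x\'; DROP TABLE users; --"'))
# -> SELECT * FROM users WHERE name = 'x'; DROP TABLE users; --'
```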