CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is a widely used technology for distinguishing real users from automated users such as bots. However, advances in AI have weakened many CAPTCHA tests and raise security concerns. In this paper, we propose a user-friendly text-based CAPTCHA generation method named Robust Text CAPTCHA (RTC). In the first stage, foregrounds and backgrounds are constructed from randomly sampled fonts and background images, which are then synthesized into identifiable pseudo-adversarial CAPTCHAs. In the second stage, we design and apply a highly transferable adversarial attack for text CAPTCHAs to better obstruct CAPTCHA solvers. Our experiments cover a comprehensive set of models, including shallow models such as KNN, SVM, and random forest, various deep neural networks, and OCR models. The experiments show that our CAPTCHAs generally have a failure rate lower than one in a million together with high usability. They are also robust against various defensive techniques that attackers may employ, including adversarial training, data pre-processing, and manual tagging.
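To make the two-stage structure concrete, the following is a minimal sketch and not the RTC implementation. It assumes a hand-written 5x5 glyph mask in place of a rendered font, random noise in place of a sampled background image, and a random linear scorer as a stand-in surrogate solver. The second stage uses a single signed-gradient step (in the style of FGSM) rather than the transferable attack proposed in the paper. All names (`composite`, `surrogate_score`, `signed_gradient_step`) are hypothetical.

```python
import random

def composite(glyph_mask, background, ink=0.0):
    """Stage 1 (sketch): alpha-blend a glyph mask (1 = ink) over a background.
    Pixel values are floats in [0, 1]."""
    return [[m * ink + (1 - m) * b for m, b in zip(mrow, brow)]
            for mrow, brow in zip(glyph_mask, background)]

def surrogate_score(image, weights):
    """Hypothetical linear surrogate solver: score = sum of weight * pixel."""
    return sum(w * p for wrow, prow in zip(weights, image)
               for w, p in zip(wrow, prow))

def signed_gradient_step(image, weights, label, eps):
    """Stage 2 (sketch): one signed-gradient step that lowers label * score.
    For a linear score the gradient w.r.t. each pixel is its weight, so every
    pixel moves by at most eps and is then clipped back into [0, 1]."""
    def sign(x):
        return (x > 0) - (x < 0)
    return [[min(1.0, max(0.0, p - eps * label * sign(w)))
             for w, p in zip(wrow, prow)]
            for wrow, prow in zip(weights, image)]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
H, W = 5, 5

# Assumed stand-in for a rendered character (roughly an "F"); 1 = ink.
glyph = [[0, 1, 1, 1, 0],
         [0, 1, 0, 0, 0],
         [0, 1, 1, 0, 0],
         [0, 1, 0, 0, 0],
         [0, 1, 0, 0, 0]]

# Assumed stand-ins for a sampled background image and a trained solver.
background = [[rng.uniform(0.5, 1.0) for _ in range(W)] for _ in range(H)]
weights = [[rng.uniform(-1.0, 1.0) for _ in range(W)] for _ in range(H)]

clean = composite(glyph, background)
label = 1 if surrogate_score(clean, weights) >= 0 else -1
adv = signed_gradient_step(clean, weights, label, eps=0.05)
```

Because each pixel changes by at most `eps`, the perturbed image stays close to the clean one, which is the property that keeps such CAPTCHAs readable for humans. The surrogate's margin `label * score` can only decrease or stay the same.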