Text passwords have long been the most popular method of user authentication and are unlikely to be fully replaced in the foreseeable future. Password authentication offers several desirable properties (e.g., low cost, high availability, ease of implementation, reusability). However, it suffers from a critical security weakness, caused mainly by humans' inability to memorize complicated strings. Users tend to choose easy-to-remember passwords, which are not uniformly distributed over the key space and are therefore susceptible to guessing attacks. To encourage and help users to adopt strong passwords, it is necessary to simulate automated password guessing methods in order to measure password strength and identify weak passwords. A large number of password guessing models have been proposed in the literature. However, little attention has been paid to providing a systematic survey, which is needed to review the state-of-the-art approaches, identify gaps, and avoid duplicated work. Motivated by this, we conduct a comprehensive survey of all password guessing studies presented in the literature from 1979 to 2022. We propose a generic methodology map of existing models to give an overview of the field, and then explain each approach in detail. We summarize the experimental procedures and available datasets used to evaluate password guessing models, along with the reported performance of representative studies. Finally, we discuss current limitations and open problems as directions for future research. We believe this survey will be helpful to both experts and newcomers interested in password security.