User-created passwords, the primary mechanism of digital authentication, exhibit common patterns and regularities that can be learned from leaked datasets. Password choices are also profoundly shaped by external factors, including social contexts, cultural trends, and popular vocabulary. Prevailing password guessing models primarily emphasize patterns derived from leaked passwords and neglect these external influences, a limitation that hampers their adaptability to emerging password trends and erodes their effectiveness over time. To address this limitation, we propose KAPG, a knowledge-augmented password guessing framework that adaptively integrates external lexical knowledge into the guessing process. KAPG couples internal statistical knowledge learned from leaked passwords with external information that reflects real-world trends. By using password prefixes as anchors for knowledge lookup, it dynamically injects relevant external cues during generation while preserving the structural regularities of authentic passwords. Experiments on twelve leaked datasets show that KAPG achieves average improvements of 36.5\% and 74.7\% over state-of-the-art models in intra-site and cross-site scenarios, respectively. Further analyses of password overlap and model efficiency highlight its robustness and computational efficiency. To counter such attacks, we further develop KAPSM, a trend-aware and site-specific password strength meter. Experiments demonstrate that KAPSM significantly outperforms existing tools in accuracy across diverse evaluation settings.
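The abstract describes using password prefixes as anchors for looking up external knowledge and injecting the result into generation. The toy sketch below illustrates one way such a lookup could work; it is not KAPG's actual architecture, which the abstract does not specify. The function names `build_prefix_index` and `blend`, the mixing `weight`, the `max_len` cap, and the choice to match on the trailing characters of the partial password are all assumptions made for illustration.

```python
from collections import defaultdict


def build_prefix_index(vocab, max_len=4):
    """Index external words: word prefix (1..max_len chars) -> next-char counts."""
    index = defaultdict(lambda: defaultdict(int))
    for word in vocab:
        # i is the prefix length; word[i] is the character that follows it.
        for i in range(1, min(len(word), max_len + 1)):
            index[word[:i]][word[i]] += 1
    return index


def blend(base_probs, prefix, index, weight=0.3, max_len=4):
    """Mix a base next-char distribution with external cues anchored on `prefix`.

    `base_probs` stands in for the output of a model trained on leaked
    passwords (hypothetical here). The anchor is the longest trailing
    substring of `prefix` (up to max_len chars) found in the external index.
    """
    cues = {}
    for k in range(min(len(prefix), max_len), 0, -1):
        anchor = prefix[-k:]
        if anchor in index:
            cues = index[anchor]
            break
    if not cues:
        # No external knowledge matches: fall back to the base distribution.
        return dict(base_probs)
    total = sum(cues.values())
    chars = set(base_probs) | set(cues)
    return {
        c: (1 - weight) * base_probs.get(c, 0.0) + weight * cues.get(c, 0) / total
        for c in chars
    }


# Hypothetical usage: trending words nudge the next-character choice.
trend_index = build_prefix_index(["swift", "swiftie"])
base = {"a": 0.5, "i": 0.5}
mixed = blend(base, "sw", trend_index, weight=0.4)
```

Because the blend is a convex combination of two normalized distributions, the result still sums to one, so the external cue shifts probability mass toward trending continuations without discarding what the base model learned from leaked passwords.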