We develop the first universal password model: a password model that, once pre-trained, can automatically adapt to any password distribution. To achieve this, the model does not need access to any plaintext passwords from the target set. Instead, it exploits users' auxiliary information, such as email addresses, as a proxy signal to predict the underlying target password distribution. The model uses deep learning to capture the correlation between the auxiliary data of a group of users (e.g., users of a web application) and their passwords. It then exploits these patterns to create a tailored password model for the target community at inference time. No further training steps, targeted data collection, or prior knowledge of the community's password distribution is required. Besides setting a new state of the art for password strength estimation, our model enables any end user (e.g., a system administrator) to autonomously generate tailored password models for their systems without the often unworkable requirement of collecting suitable training data and fitting the underlying password model. Ultimately, our framework makes well-calibrated password models broadly available to the community, addressing a major challenge in deploying password security solutions at scale.