NeuraCrypt (Yala et al., arXiv 2021) is an algorithm that converts a sensitive dataset into an encoded dataset so that (1) it is still possible to train machine learning models on the encoded data, but (2) an adversary who has access only to the encoded dataset cannot learn much about the original sensitive dataset. We break NeuraCrypt's privacy claims by perfectly solving the authors' public challenge and by showing that NeuraCrypt does not satisfy the formal privacy definitions posed in the original paper. Our attack consists of a series of boosting steps that, coupled with various design flaws, turn a 1% attack advantage into a complete (100%) break of the scheme.