Clustering models constitute a class of unsupervised machine learning methods that are used in a number of application pipelines and play a vital role in modern data science. With recent advancements in deep learning, deep clustering models have emerged as the current state-of-the-art over traditional clustering approaches, especially for high-dimensional image datasets. While traditional clustering approaches have been analyzed from a robustness perspective, no prior work has investigated adversarial attacks and robustness for deep clustering models in a principled manner. To bridge this gap, we propose a blackbox attack using Generative Adversarial Networks (GANs), where the adversary does not know which deep clustering model is being used but can query it for outputs. We analyze our attack against multiple state-of-the-art deep clustering models and real-world datasets, and find that it is highly successful. We then employ some natural unsupervised defense approaches, but find that these are unable to mitigate our attack. Finally, we attack Face++, a production-level face clustering API service, and find that we can significantly reduce its performance as well. Through this work, we thus aim to motivate the need for truly robust deep clustering models.
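The paper's attack is GAN-based, but the query-only threat model it assumes can be illustrated with a much simpler sketch. Below is a minimal, hedged example: the names `cluster_query` and `random_search_attack` are hypothetical, a scikit-learn KMeans pipeline stands in for the black-box deep clustering model, and plain random search replaces the paper's trained adversarial generator. It shows the essential loop of querying the model on perturbed inputs and keeping the perturbation that most degrades agreement with the clean clustering.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import normalized_mutual_info_score

# Stand-in for the black-box deep clustering model: the attacker can
# only query it for cluster assignments, not inspect its parameters.
_blackbox = KMeans(n_clusters=3, n_init=10, random_state=0)

def cluster_query(x: np.ndarray) -> np.ndarray:
    """Query the black-box model for cluster labels (hypothetical API)."""
    return _blackbox.fit_predict(x)

def random_search_attack(x, eps=0.5, n_trials=20, seed=0):
    """Zeroth-order stand-in for the paper's GAN generator: sample
    L-infinity-bounded perturbations and keep the one that most disrupts
    the clustering, measured by NMI against the clean assignment."""
    rng = np.random.default_rng(seed)
    clean = cluster_query(x)
    best_x, best_nmi = x, 1.0
    for _ in range(n_trials):
        delta = rng.uniform(-eps, eps, size=x.shape)
        adv_labels = cluster_query(x + delta)
        nmi = normalized_mutual_info_score(clean, adv_labels)
        if nmi < best_nmi:
            best_x, best_nmi = x + delta, nmi
    return best_x, best_nmi

# Toy data in place of the high-dimensional image datasets used in the paper.
x, _ = make_blobs(n_samples=300, centers=3, cluster_std=1.0, random_state=0)
x_adv, nmi = random_search_attack(x)
print(f"cluster agreement (NMI) after attack: {nmi:.3f}")  # 1.0 = unchanged
```

Agreement between the clean and post-attack clusterings (here normalized mutual information) is one natural success metric for such attacks; lower agreement means the perturbation has more severely disrupted the model's output.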