Content scanning systems employ perceptual hashing algorithms to scan user content for illegal material, such as child pornography or terrorist recruitment flyers. Perceptual hashing algorithms help determine whether two images are visually similar while preserving the privacy of the input images. Several proposals from industry and academia would conduct content scanning on client devices such as smartphones, because the impending rollout of end-to-end encryption will make server-side content scanning difficult. However, these proposals have met with strong criticism because the technology could be misused and repurposed. Our work informs this conversation by experimentally characterizing the potential for one type of misuse: attackers manipulating the content scanning system to perform physical surveillance on target locations. Our contributions are threefold: (1) we offer a definition of physical surveillance in the context of client-side image scanning systems; (2) we experimentally characterize this risk and create a surveillance algorithm that achieves physical surveillance rates of more than 40% by poisoning 5% of the perceptual hash database; and (3) we experimentally study the trade-off between the robustness of client-side image scanning systems and surveillance, showing that more robust detection of illegal material leads to increased potential for physical surveillance.
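As a rough illustration of how a perceptual hash comparison can work, the following minimal sketch uses a simple average hash and a Hamming-distance threshold. It is not the hashing algorithm or the surveillance algorithm studied in this work. The names `average_hash`, `hamming_distance`, `matches_database`, the toy 4x4 grayscale grids, and the threshold values are all hypothetical and chosen only for illustration. The sketch also assumes images have already been downscaled to a small grayscale grid, a step real systems perform first.

```python
from typing import List

# Toy stand-in for a downscaled grayscale image: rows of pixel values in 0-255.
Image = List[List[int]]


def average_hash(image: Image) -> int:
    """Pack one bit per pixel into an int: 1 if the pixel is brighter than the mean."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def matches_database(image: Image, database: List[int], threshold: int) -> bool:
    """Flag the image if its hash is within `threshold` bits of any database entry."""
    h = average_hash(image)
    return any(hamming_distance(h, entry) <= threshold for entry in database)


# Hypothetical example data (not from the paper).
original: Image = [
    [10, 200, 10, 200],
    [200, 10, 200, 10],
    [10, 200, 10, 200],
    [200, 10, 200, 10],
]
# Slightly brightened copy of `original` with one pixel altered.
perturbed: Image = [
    [220, 205, 15, 205],
    [205, 15, 205, 15],
    [15, 205, 15, 205],
    [205, 15, 205, 15],
]
# A visually different image (a smooth gradient).
unrelated: Image = [
    [0, 16, 32, 48],
    [64, 80, 96, 112],
    [128, 144, 160, 176],
    [192, 208, 224, 240],
]

database = [average_hash(original)]

# A tight threshold tolerates the small perturbation but rejects the unrelated image.
tight_perturbed = matches_database(perturbed, database, threshold=2)
tight_unrelated = matches_database(unrelated, database, threshold=2)
# A much looser threshold also flags the unrelated image.
loose_unrelated = matches_database(unrelated, database, threshold=8)
```

In this toy setting, raising the threshold makes matching more tolerant of modifications to a known image, but it also widens the set of unrelated images that a single database entry can flag. That is only an intuition for the robustness versus surveillance trade-off described above, under the assumptions stated, and not a reproduction of the experiments.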