Can deep learning models achieve greater generalization if their training is guided by reference to human perceptual abilities? And how can we implement this in a practical manner? This paper proposes a training strategy to ConveY Brain Oversight to Raise Generalization (CYBORG). This new approach incorporates human-annotated saliency maps into a loss function that guides the model's learning to focus on image regions that humans deem salient for the task. The Class Activation Mapping (CAM) mechanism is used to probe the model's current saliency in each training batch, juxtapose this model saliency with human saliency, and penalize large differences. Results on the task of synthetic face detection, selected to illustrate the effectiveness of the approach, show that, across multiple classification network architectures, CYBORG leads to a significant improvement in accuracy on unseen samples consisting of face images generated by six Generative Adversarial Networks. We also show that scaling to even seven times the training data, or using non-human-saliency auxiliary information, such as segmentation masks, and a standard loss cannot beat the performance of CYBORG-trained models. As a side effect of this work, we observe that the addition of explicit region annotation to the task of synthetic face detection increased human classification accuracy. This work opens a new area of research on how to incorporate human visual saliency into loss functions in practice. All data, code and pre-trained models used in this work are offered with this paper.
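The loss mechanism described above can be illustrated with a minimal NumPy sketch. The abstract does not give the exact form of the loss, so everything specific here is an assumption: the weighted sum of a cross-entropy term and a mean squared error between min-max normalized maps, the weight `alpha`, the bias-free classifier head, and all function names are hypothetical choices made for illustration. The sketch covers the forward computation for one sample only; an actual training loop would need an automatic differentiation framework.

```python
import numpy as np

def class_activation_map(features, fc_weights, class_idx):
    """CAM for one class: feature maps weighted by that class's classifier weights.

    features:   (K, H, W) maps from the last convolutional layer
    fc_weights: (C, K) weights of the linear layer that follows global average pooling
    """
    return np.tensordot(fc_weights[class_idx], features, axes=1)  # shape (H, W)

def min_max_normalize(m, eps=1e-8):
    """Rescale a map to roughly [0, 1] so model and human saliency are comparable."""
    m = m - m.min()
    return m / (m.max() + eps)

def softmax_cross_entropy(logits, label):
    """Numerically stable cross-entropy for a single sample."""
    shifted = logits - logits.max()
    log_probs = shifted - np.log(np.exp(shifted).sum())
    return -log_probs[label]

def saliency_guided_loss(features, fc_weights, label, human_map, alpha=0.5):
    """Assumed form: alpha * classification term + (1 - alpha) * saliency term."""
    # Global average pooling followed by a linear layer (bias omitted for brevity).
    logits = fc_weights @ features.mean(axis=(1, 2))
    cam = min_max_normalize(class_activation_map(features, fc_weights, label))
    human = min_max_normalize(human_map)
    # Penalize differences between model saliency and human saliency.
    saliency_term = np.mean((cam - human) ** 2)
    class_term = softmax_cross_entropy(logits, label)
    return alpha * class_term + (1.0 - alpha) * saliency_term, saliency_term

# Toy usage with random data: 3 feature maps of size 4x4, 2 classes.
rng = np.random.default_rng(0)
features = rng.random((3, 4, 4))
fc_weights = rng.normal(size=(2, 3))
human_map = rng.random((4, 4))  # stand-in for a human-annotated saliency map
total, saliency = saliency_guided_loss(features, fc_weights, 1, human_map)
print(f"total loss: {total:.4f}, saliency term: {saliency:.4f}")
```

In this sketch the saliency term is close to zero when the human map matches the model's CAM and grows as the two diverge, which is the behavior the penalty is meant to have.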