Image Forgery Localization (IFL) is a crucial task in image forensics that aims to identify manipulated or tampered regions within an image at the pixel level. Existing methods typically generate a single deterministic localization map, which often lacks the precision and reliability required for high-stakes applications such as forensic analysis and security surveillance. To enhance the credibility of predictions and mitigate the risk of errors, we introduce a Conditional Bernoulli Diffusion Model (CBDiff). Given a forged image, CBDiff generates multiple diverse and plausible localization maps, offering a richer representation of the distribution over forged regions and capturing the uncertainty and variability inherent in them. Furthermore, CBDiff incorporates Bernoulli noise into the diffusion process to more faithfully reflect the binary and sparse nature of forgery masks. Additionally, CBDiff introduces a Time-Step Cross-Attention (TSCAttention) module, which is designed to combine semantic feature guidance with diffusion time-step information to improve manipulation detection. Extensive experiments on eight public benchmark datasets demonstrate that CBDiff significantly outperforms existing state-of-the-art methods, highlighting its strong potential for real-world deployment.
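To make the idea of Bernoulli noise on a binary mask concrete, the following is a minimal sketch of a forward noising step. It is not the CBDiff implementation: the abstract does not specify the kernel or the noise schedule, so the sketch assumes the commonly used binomial diffusion form q(y_t | y_0) = Bernoulli(alpha_bar_t * y_0 + (1 - alpha_bar_t) / 2), and the names `bernoulli_forward`, `flip_fraction`, and `alpha_bar` are hypothetical.

```python
import numpy as np


def bernoulli_forward(mask, alpha_bar, rng):
    """Sample a noisy binary mask y_t from a clean binary mask y_0.

    Assumed kernel (not taken from the paper):
        q(y_t | y_0) = Bernoulli(alpha_bar * y_0 + (1 - alpha_bar) / 2)
    alpha_bar = 1 keeps the mask unchanged; alpha_bar = 0 gives fair coin flips.
    """
    probs = alpha_bar * mask + (1.0 - alpha_bar) * 0.5
    # Each pixel becomes 1 with probability probs, so the output stays binary.
    return (rng.random(mask.shape) < probs).astype(np.uint8)


def flip_fraction(a, b):
    """Fraction of pixels on which two binary masks disagree."""
    return float(np.mean(a != b))


rng = np.random.default_rng(0)

# Toy forgery mask: a small tampered rectangle on a mostly clean image.
mask = np.zeros((64, 64), dtype=np.uint8)
mask[20:30, 25:40] = 1

light = bernoulli_forward(mask, alpha_bar=0.95, rng=rng)  # early time step
heavy = bernoulli_forward(mask, alpha_bar=0.05, rng=rng)  # late time step

print("flipped (light noise):", flip_fraction(mask, light))
print("flipped (heavy noise):", flip_fraction(mask, heavy))
```

Unlike Gaussian noise, every intermediate sample here remains a valid binary mask, which is the property the abstract attributes to Bernoulli noise. Under this assumed kernel, each pixel flips with probability (1 - alpha_bar) / 2, so the mask degrades gradually toward pure coin flips as alpha_bar approaches zero.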