Drag-based image editing aims to modify visual content according to user-specified drag operations. Although existing methods have made notable progress, they still fail to fully exploit the contextual information in the reference image, including fine-grained texture details, leading to edits with limited coherence and fidelity. To address this challenge, we introduce ContextDrag, a new paradigm for drag-based editing that leverages the strong contextual modeling capability of editing models such as FLUX-Kontext. By incorporating VAE-encoded features from the reference image, ContextDrag exploits rich contextual cues and preserves fine-grained details, without the need for fine-tuning or inversion. Specifically, ContextDrag first introduces a novel Context-preserving Token Injection (CTI) that injects noise-free reference features into their correct destination locations via a Latent-space Reverse Mapping (LRM) algorithm. This strategy enables precise drag control while preserving consistency in both semantics and texture details. Second, ContextDrag adopts a novel Position-Consistent Attention (PCA) mechanism, which positionally re-encodes the reference tokens and applies overlap-aware masking to eliminate interference from irrelevant reference features. Extensive experiments on DragBench-SR and DragBench-DR demonstrate that our approach surpasses all existing SOTA methods. Code will be publicly available.
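The abstract does not spell out how LRM and CTI operate; the sketch below illustrates one plausible reading, assuming LRM maps each drag destination back to its handle location on the latent grid and CTI then overwrites the noisy token at the destination with the clean VAE token from that source. The function names (`latent_reverse_mapping`, `context_token_injection`), the endpoint-only mapping, and the tensor shapes are all illustrative assumptions, not the paper's actual algorithm.

```python
import torch

def latent_reverse_mapping(drag_pairs, H, W):
    """Hypothetical LRM sketch: reverse each user drag so that the latent
    cell at the drag target knows which source cell to pull content from.
    A real implementation would densify this over the whole moved region
    (e.g., by inverting a dense displacement field), not just endpoints.

    drag_pairs: list of ((hx, hy), (tx, ty)) handle/target latent coords.
    Returns {dest_flat_index: src_flat_index} over the H*W latent grid.
    """
    mapping = {}
    for (hx, hy), (tx, ty) in drag_pairs:
        if 0 <= hx < W and 0 <= hy < H and 0 <= tx < W and 0 <= ty < H:
            mapping[ty * W + tx] = hy * W + hx
    return mapping

def context_token_injection(noisy_tokens, ref_tokens, mapping):
    """Hypothetical CTI sketch: inject noise-free VAE-encoded reference
    tokens into their mapped destination slots in the noisy latent.

    noisy_tokens, ref_tokens: (B, H*W, C) token sequences.
    """
    out = noisy_tokens.clone()
    for dest, src in mapping.items():
        out[:, dest] = ref_tokens[:, src]
    return out

# Example: drag the latent-grid point (4, 5) to (10, 5) on a 16x16 grid.
mapping = latent_reverse_mapping([((4, 5), (10, 5))], H=16, W=16)
x = torch.randn(1, 256, 64)    # noisy latent tokens at some denoising step
ref = torch.randn(1, 256, 64)  # clean VAE tokens of the reference image
x = context_token_injection(x, ref, mapping)
```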
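Likewise, the abstract only names the two ingredients of PCA. A minimal sketch of one possible realization follows, assuming (a) reference tokens are appended as extra attention keys and re-encoded with the same 2D position ids as the edited-image grid, so corresponding locations share positions, and (b) an overlap-aware boolean mask blocks attention to reference tokens that fall under drag destinations, where stale content would interfere. Both helpers (`reencode_ref_positions`, `overlap_aware_mask`) are hypothetical names introduced here for illustration.

```python
import torch

def reencode_ref_positions(H, W):
    """Hypothetical positional re-encoding: assign reference tokens the
    same (row, col) position ids as the edited-image grid, instead of
    offsetting them onto a separate canvas."""
    ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    return torch.stack([ys, xs], dim=-1).view(-1, 2)  # (H*W, 2) ids

def overlap_aware_mask(n_img, n_ref, overlap_ref_indices):
    """Hypothetical overlap-aware mask: image-token queries may attend to
    all image tokens and to reference tokens, except reference tokens
    covered by drag destinations. Returns bool (n_img, n_img + n_ref),
    True = attend; usable as attn_mask in scaled_dot_product_attention."""
    mask = torch.ones(n_img, n_img + n_ref, dtype=torch.bool)
    if overlap_ref_indices:
        idx = torch.as_tensor(overlap_ref_indices) + n_img
        mask[:, idx] = False
    return mask

# Example on a 16x16 latent: mask out reference tokens at two overlapped cells.
pos_ids = reencode_ref_positions(16, 16)
mask = overlap_aware_mask(n_img=256, n_ref=256, overlap_ref_indices=[85, 90])
```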