Training for Identity, Inference for Controllability: A Unified Approach to Tuning-Free Face Personalization

Tuning-free face personalization methods have developed along two distinct paradigms: text embedding approaches, which map facial features into the text embedding space, and adapter-based methods, which inject features through auxiliary cross-attention layers. While both paradigms have shown promise, existing methods struggle to achieve high identity fidelity and flexible text controllability at the same time. We introduce UniID, a unified tuning-free framework that integrates both paradigms. Our key insight is that, when the two approaches are merged, they should mutually reinforce only identity-relevant information while preserving the original diffusion prior for non-identity attributes. We realize this through a principled training-inference strategy: during training, we employ an identity-focused learning scheme that guides both branches to capture identity features exclusively; at inference, we introduce a normalized rescaling mechanism that recovers the text controllability of the base diffusion model while enabling complementary identity signals to enhance each other. This design enables UniID to achieve high-fidelity face personalization with flexible text controllability. Extensive experiments against six state-of-the-art methods demonstrate that UniID achieves superior performance in both identity preservation and text controllability. Code will be available at https://github.com/lyuPang/UniID.
