Tuning-free face personalization methods have developed along two distinct paradigms: text embedding approaches, which map facial features into the text embedding space, and adapter-based methods, which inject features through auxiliary cross-attention layers. While both paradigms have shown promise, existing methods struggle to simultaneously achieve high identity fidelity and flexible text controllability. We introduce UniID, a unified tuning-free framework that synergistically integrates both paradigms. Our key insight is that when merging these approaches, they should mutually reinforce only identity-relevant information while preserving the original diffusion prior for non-identity attributes. We realize this through a principled training-inference strategy: during training, we employ an identity-focused learning scheme that guides both branches to capture identity features exclusively; at inference, we introduce a normalized rescaling mechanism that recovers the text controllability of the base diffusion model while enabling the complementary identity signals to enhance each other. This design enables UniID to achieve high-fidelity face personalization with flexible text controllability. Extensive experiments against six state-of-the-art methods demonstrate that UniID achieves superior performance in both identity preservation and text controllability. Code will be available at https://github.com/lyuPang/UniID.