Controllable Layer Decomposition for Reversible Multi-Layer Image Generation

This work presents Controllable Layer Decomposition (CLD), a method for achieving fine-grained and controllable multi-layer separation of raster images. In practical workflows, designers typically generate and edit each RGBA layer independently before compositing them into a final raster image. However, this process is irreversible: once composited, layer-level editing is no longer possible. Existing methods commonly rely on image matting and inpainting, but remain limited in controllability and segmentation precision. To address these challenges, we propose two key modules: LayerDecompose-DiT (LD-DiT), which decouples image elements into distinct layers and enables fine-grained control; and Multi-Layer Conditional Adapter (MLCA), which injects target image information into multi-layer tokens to achieve precise conditional generation. To enable a comprehensive evaluation, we build a new benchmark and introduce tailored evaluation metrics. Experimental results show that CLD consistently outperforms existing methods in both decomposition quality and controllability. Furthermore, the separated layers produced by CLD can be directly manipulated in commonly used design tools such as PowerPoint, highlighting its practical value and applicability in real-world creative workflows. Our project is available at https://monkek123king.github.io/CLD_page/.
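To illustrate why compositing is irreversible, the following minimal sketch flattens RGBA layers with the standard Porter-Duff "over" operator and shows that two different layer stacks yield the same flattened pixel. This is not part of CLD; the helper names `over` and `flatten` and the single-pixel, straight-alpha representation are assumptions made purely for illustration.

```python
def over(top, bottom):
    """Porter-Duff 'over' for one straight-alpha RGBA pixel (channels in [0, 1])."""
    *top_rgb, top_a = top
    *bot_rgb, bot_a = bottom
    out_a = top_a + bot_a * (1.0 - top_a)
    if out_a == 0.0:
        # Fully transparent result: colour is undefined, so return zeros.
        return (0.0, 0.0, 0.0, 0.0)
    out_rgb = [
        (t * top_a + b * bot_a * (1.0 - top_a)) / out_a
        for t, b in zip(top_rgb, bot_rgb)
    ]
    return (*out_rgb, out_a)


def flatten(layers):
    """Composite a bottom-to-top list of RGBA pixels into a single pixel."""
    result = (0.0, 0.0, 0.0, 0.0)  # start from a transparent canvas
    for layer in layers:
        result = over(layer, result)
    return result


# Stack A: opaque red background under a half-transparent blue layer.
stack_a = [(1.0, 0.0, 0.0, 1.0), (0.0, 0.0, 1.0, 0.5)]
# Stack B: opaque purple background under a fully transparent green layer.
stack_b = [(0.5, 0.0, 0.5, 1.0), (0.0, 1.0, 0.0, 0.0)]

# Both stacks flatten to the same pixel, so the layers cannot be
# recovered from the composite alone.
print(flatten(stack_a) == flatten(stack_b))
```

Because many layer stacks map to the same composite, recovering layers from a raster image is an ill-posed inverse problem, which is why a decomposition method needs additional priors or user-supplied control.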
