SafeGen: Embedding Ethical Safeguards in Text-to-Image Generation

Generative Artificial Intelligence (AI) has created unprecedented opportunities for creative expression, education, and research. Text-to-image systems such as DALL·E, Stable Diffusion, and Midjourney can now convert ideas into visuals within seconds, but they also present a dual-use dilemma that raises critical ethical concerns: amplifying societal biases, producing high-fidelity disinformation, and violating intellectual property. This paper introduces SafeGen, a framework that embeds ethical safeguards directly into the text-to-image generation pipeline and grounds its design in established principles for Trustworthy AI. SafeGen integrates two complementary components: BGE-M3, a fine-tuned text classifier that filters harmful or misleading prompts, and Hyper-SD, an optimized diffusion model that produces high-fidelity, semantically aligned images. Built on a curated multilingual (English-Vietnamese) dataset and a fairness-aware training process, SafeGen demonstrates that creative freedom and ethical responsibility can be reconciled within a single workflow. Quantitative evaluations confirm its effectiveness: Hyper-SD achieves IS = 3.52, FID = 22.08, and SSIM = 0.79, while BGE-M3 reaches an F1-score of 0.81. An ablation study further validates the importance of domain-specific fine-tuning for both modules. Case studies illustrate SafeGen's practical impact in blocking unsafe prompts, generating inclusive teaching materials, and reinforcing academic integrity.
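The two-stage design described above (a prompt classifier gating an image generator) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the names `safe_generate`, `GenerationResult`, `toy_classifier`, and `toy_generator` are hypothetical, the 0.5 threshold is arbitrary, and the toy functions stand in for the fine-tuned BGE-M3 classifier and the Hyper-SD model, whose actual interfaces are not given in the abstract.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class GenerationResult:
    prompt: str
    allowed: bool            # False when the classifier blocks the prompt
    harm_score: float        # classifier output, assumed to lie in [0, 1]
    image: Optional[bytes]   # None when the prompt was blocked


def safe_generate(
    prompt: str,
    classify: Callable[[str], float],
    generate: Callable[[str], bytes],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> GenerationResult:
    """Run the classifier first; call the generator only for allowed prompts."""
    score = classify(prompt)
    if score >= threshold:
        # Blocked: the generator is never invoked for this prompt.
        return GenerationResult(prompt, False, score, None)
    return GenerationResult(prompt, True, score, generate(prompt))


# Toy stand-ins (hypothetical) so the sketch runs without any model weights.
def toy_classifier(prompt: str) -> float:
    blocked_terms = ("forged", "fake news photo")
    text = prompt.lower()
    return 1.0 if any(term in text for term in blocked_terms) else 0.0


def toy_generator(prompt: str) -> bytes:
    return f"<image for: {prompt}>".encode("utf-8")


if __name__ == "__main__":
    for p in ("a watercolor of a rice field", "a forged university diploma"):
        result = safe_generate(p, toy_classifier, toy_generator)
        print(p, "->", "generated" if result.allowed else "blocked")
```

Passing the classifier and generator as callables keeps the gate independent of any particular model, so a real text classifier and diffusion pipeline could be substituted without changing the control flow.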
