AI-driven molecular generation is reshaping drug discovery and materials design, yet the lack of protection mechanisms leaves AI-generated molecules vulnerable to unauthorized reuse and ambiguous provenance. This limitation undermines both scientific reproducibility and intellectual property security. To address this challenge, we propose MolMark, the first deep-learning-based watermarking framework for molecules, which is designed to embed high-fidelity digital signatures into molecules without compromising molecular functionality. MolMark learns to modulate chemically meaningful atom-level representations and enforces geometric robustness through SE(3)-invariant features, so that the watermark remains recoverable under rotation, translation, and reflection. MolMark also integrates with AI-based molecular generative models, allowing watermarking to be treated as a learned transformation that interferes minimally with molecular structure. Experiments on benchmark datasets (QM9, GEOM-DRUG) and state-of-the-art molecular generative models (GeoBFN, GeoLDM) demonstrate that MolMark can embed 16-bit watermarks while retaining more than 90% of essential molecular properties, preserving downstream performance, and achieving >95% extraction accuracy under SE(3) transformations. MolMark establishes a principled pathway for unifying molecular generation with verifiable authorship, supporting trustworthy and accountable AI-driven molecular discovery.
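To make the robustness claim concrete, the following is a minimal sketch of what an SE(3)-invariant feature and a bit-level extraction accuracy look like. It is not MolMark's architecture: the use of pairwise interatomic distances as the invariant feature, and the helper names `pairwise_distances`, `random_rotation`, and `bit_accuracy`, are assumptions made purely for illustration. Pairwise distances are unchanged by rotation and translation, and also by reflection, although reflection is strictly outside SE(3).

```python
import numpy as np

def pairwise_distances(coords):
    """Return the (N, N) matrix of Euclidean distances between atoms."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.linalg.norm(diff, axis=-1)

def random_rotation(rng):
    """Sample a proper rotation matrix (det = +1) via QR decomposition."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))   # normalise column signs of the orthogonal factor
    if np.linalg.det(q) < 0:      # flip one axis if we landed on a reflection
        q[:, 0] = -q[:, 0]
    return q

def bit_accuracy(embedded, extracted):
    """Fraction of watermark bits recovered correctly."""
    return float(np.mean(np.asarray(embedded) == np.asarray(extracted)))

rng = np.random.default_rng(0)
coords = rng.normal(size=(5, 3))          # hypothetical 5-atom conformer

rot = random_rotation(rng)
shift = rng.normal(size=3)
moved = coords @ rot.T + shift            # rotation followed by translation
mirrored = moved * np.array([-1.0, 1.0, 1.0])  # additional reflection in x

# The distance matrix is identical for all three poses.
assert np.allclose(pairwise_distances(coords), pairwise_distances(moved))
assert np.allclose(pairwise_distances(coords), pairwise_distances(mirrored))

# A 16-bit message with one flipped bit gives 15/16 bit accuracy.
message = rng.integers(0, 2, size=16)
decoded = message.copy()
decoded[3] ^= 1
print(bit_accuracy(message, decoded))     # 0.9375
```

An extractor that reads only such invariant quantities would, by construction, return the same bits for any rigidly moved copy of the molecule. Under this bit-level measure, more than 95% accuracy on a 16-bit message corresponds to fewer than one wrong bit per molecule on average.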