Type "a sea otter with a pearl earring by Johannes Vermeer" or "a photo of a teddy bear on a skateboard in Times Square" into OpenAI's DALL-E-2 paint-by-text synthesis engine and you will not be disappointed by the delightful and eerily pertinent results. The ability to synthesize highly realistic images, with seemingly no limitation other than our imagination, is sure to yield many exciting and creative applications. These images are also likely to pose new challenges to the photo-forensic community. Motivated by the fact that paint-by-text synthesis is not based on explicit geometric modeling, and that the human visual system is often oblivious to even glaring geometric inconsistencies, we provide an initial exploration of the perspective consistency of DALL-E-2 synthesized images to determine whether geometry-based forensic analyses will prove fruitful in detecting this new breed of synthetic media.