Recently, creative generative artificial intelligence software has emerged as a pivotal assistant, enabling users to generate content and seek inspiration rapidly. Text-to-image (T2I) software, being one of the most widely used among them, is used to synthesize images with simple text input by engaging in a cross-modal process. However, despite substantial advancements in several fields, T2I software often encounters defects and erroneous, including omitting focal entities, low image realism, and mismatched text-image information. The cross-modal nature of T2I software makes it challenging for traditional testing methods to detect defects. Lacking test oracles further increases the complexity of testing. To address this deficiency, we propose ACTesting, an Automated Cross-modal Testing Method of Text-to-Image software, the first testing method designed specifically for T2I software. We construct test samples based on entities and relationship triples following the fundamental principle of maintaining consistency in the semantic information to overcome the cross-modal matching challenges. To address the issue of testing oracle scarcity, we first design the metamorphic relation for T2I software and implement three types of mutation operators guided by adaptability density. In the experiment, we conduct ACTesting on four widely-used T2I software. The results show that ACTesting can generate error-revealing tests, reducing the text-image consistency by up to 20% compared with the baseline. We also conduct the ablation study that effectively showcases the efficacy of each mutation operator, based on the proposed metamorphic relation. The results demonstrate that ACTesting can identify abnormal behaviors of T2I software effectively.
翻译:暂无翻译