Advancements in AI image generation, particularly diffusion models, have progressed rapidly. However, the absence of an established framework for quantifying the reliability of AI-generated images hinders their use in critical decision-making tasks, such as medical image diagnosis. In this study, we address the task of detecting anomalous regions in medical images using diffusion models and propose a statistical method to quantify the reliability of the detected anomalies. The core concept of our method involves a selective inference framework, wherein statistical tests are conducted under the condition that the images are produced by a diffusion model. With our approach, the statistical significance of anomaly detection results can be quantified in the form of a $p$-value, enabling decision-making with controlled error rates, as is standard in medical practice. We demonstrate the theoretical soundness and practical effectiveness of our statistical test through numerical experiments on both synthetic and brain image datasets.
翻译:暂无翻译