Conformal inference is a method that provides prediction sets for machine learning models, operating independently of the underlying distributional assumptions and relying solely on the exchangeability of training and test data. Despite its wide applicability and popularity, its application in graph-structured problems remains underexplored. This paper addresses this gap by developing an approach that leverages the rich information encoded in the graph structure of predicted classes to enhance the interpretability of conformal sets. Using a motivating example from genomics, specifically imaging-based spatial transcriptomics data and single-cell RNA sequencing data, we demonstrate how incorporating graph-structured constraints can improve the interpretation of cell type predictions. This approach aims to generate more coherent conformal sets that align with the inherent relationships among classes, facilitating clearer and more intuitive interpretations of model predictions. Additionally, we provide a technique to address non-exchangeability, particularly when the distribution of the response variable changes between training and test datasets. We implemented our method in the open-source R package scConform, available at https://github.com/ccb-hms/scConform.
翻译:暂无翻译