Biomedical studies have revealed the crucial role of miRNAs in the progression of many diseases, and computational prediction methods are increasingly proposed to help biological experiments verify miRNA-disease associations (MDAs). Generalizability is a significant but previously underemphasized issue: predictions ought to be available for entities with few or no existing MDAs. In this study, we address it at the stages of data, model, and result analysis. First, we integrate multi-source data into a miRNA-PCG-disease graph that covers all human miRNAs and diseases recorded in authoritative sources, and we split the verified MDAs by time and known degree to form a benchmark. Second, we propose an end-to-end data-driven model that avoids taking existing MDAs as an input feature. It performs node feature encoding, graph structure learning, and binary prediction, centered on a heterogeneous graph transformer. Finally, computational experiments indicate that our method achieves state-of-the-art performance on basic metrics and effectively alleviates the neglect of miRNAs and diseases with few or no known associations. Predictions are made for all human miRNA-disease pairs; case studies further demonstrate that the method can make reliable MDA detections on unseen diseases and that the basis of each prediction is explainable at the instance level.