In this paper, we consider a novel framework of positive-unlabeled data for survival outcomes. The positive data consist of survival times observed for subjects who experience the event during the observation period. The unlabeled data consist of censoring times observed for subjects whose event status (whether or not the event occurred) is unknown. We consider two cases: (1) the censoring time is also observed in the positive data, and (2) it is not observed. For both cases, we develop parametric models, nonparametric models, and machine learning models, together with estimation strategies for these models. Simulation studies show that under this data setup, traditional survival analysis may yield severely biased results, while the proposed estimation methods can provide valid results.
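To make the data setup concrete, the following is a minimal simulation sketch of how a naive analysis can go wrong. It is not the paper's model or estimator. It assumes exponential event and censoring times, assumes that a subject with an event is labeled positive with a fixed probability `label_prob`, and assumes that the naive analysis treats every unlabeled subject as right-censored at the observed censoring time. All function names and parameter values are hypothetical.

```python
import numpy as np


def simulate_pu_survival(n, event_rate, censor_rate, label_prob, seed=0):
    """Simulate positive-unlabeled survival data (illustrative assumptions only).

    Positive subjects: the event occurred before censoring AND was labeled;
    their survival time is observed.
    Unlabeled subjects: only the censoring time is observed; whether the
    event occurred before that time is unknown to the analyst.
    """
    rng = np.random.default_rng(seed)
    event_time = rng.exponential(1.0 / event_rate, size=n)
    censor_time = rng.exponential(1.0 / censor_rate, size=n)
    had_event = event_time <= censor_time
    # Assumption: each true event is labeled independently with prob. label_prob.
    is_positive = had_event & (rng.random(n) < label_prob)
    observed_time = np.where(is_positive, event_time, censor_time)
    return observed_time, is_positive


def naive_exponential_rate(observed_time, is_positive):
    """Standard exponential MLE (events / total time at risk),
    treating every unlabeled subject as right-censored."""
    return is_positive.sum() / observed_time.sum()


TRUE_RATE = 1.0  # hypothetical true event hazard
times, positive = simulate_pu_survival(
    n=20_000, event_rate=TRUE_RATE, censor_rate=0.5, label_prob=0.6
)
naive_rate = naive_exponential_rate(times, positive)
print(f"true hazard: {TRUE_RATE:.2f}, naive estimate: {naive_rate:.2f}")
```

Under these assumptions the naive estimate falls well below the true hazard, because unlabeled subjects who actually had an event are counted as event-free for their whole censoring time. This only illustrates the kind of bias the abstract refers to; a valid estimator would need to model the unlabeled group explicitly.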