Can we predict whether an early-stage cancer patient is at high risk of developing distant metastasis, and which clinicopathological factors are associated with that risk? In this paper, we propose a ranking-based, censoring-aware machine learning model for answering such questions. Through L1-regularization, the proposed model generates an interpretable formula for risk stratification that uses a minimal number of clinicopathological covariates. Using this approach, we analyze the association of time to distant metastasis (TTDM) with various clinical parameters for early-stage, luminal (ER+ or HER2-) breast cancer patients who received endocrine therapy but no chemotherapy (n = 728). The TTDM risk stratification formula obtained with the proposed approach is based primarily on mitotic score, histological tumor type, and lymphovascular invasion. These findings are consistent with the known role of these covariates in increased risk of distant metastasis. Our analysis shows that the proposed risk stratification formula can discriminate between cases with high and low risk of distant metastasis (p-value < 0.005) and can also rank cases by their time to distant metastasis with a concordance index of 0.73.
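
To make the reported concordance index concrete, the following minimal sketch computes Harrell's C for right-censored data. It is an illustration under stated assumptions, not the authors' implementation: the function name `concordance_index` and the toy follow-up times, event flags, and risk scores are hypothetical. A pair of cases is treated as comparable only when the case with the shorter follow-up time actually experienced the event, which is one common way to account for censoring.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float],
    events: Sequence[bool],
    risk_scores: Sequence[float],
) -> float:
    """Harrell's C: fraction of comparable pairs that the scores rank correctly.

    times       -- follow-up time per case (time to event or to censoring)
    events      -- True if distant metastasis was observed, False if censored
    risk_scores -- higher score means higher predicted risk (earlier event)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            # A censored case cannot be the earlier member of a comparable pair,
            # because its true event time is unknown.
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0  # earlier event received the higher risk
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable


# Hypothetical toy data: four cases, two with observed events, two censored.
toy_times = [2.0, 5.0, 7.0, 10.0]
toy_events = [True, False, True, False]
toy_scores = [0.9, 0.4, 0.05, 0.1]

print(concordance_index(toy_times, toy_events, toy_scores))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the reported 0.73 indicates that the formula orders roughly three out of four comparable pairs correctly.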