The expression of the Human Epidermal growth factor Receptor-2 (HER2) is an important prognostic biomarker for breast cancer treatment selection. However, HER2 scoring has notoriously high interobserver variability due to stain variations between centers and the need to visually estimate the staining intensity in specific percentages of tumor area. In this paper, focusing on the interpretability of HER2 scoring by a pathologist, we propose a semi-automatic, two-stage deep learning approach that directly evaluates the clinical HER2 guidelines defined by the American Society of Clinical Oncology/College of American Pathologists (ASCO/CAP). In the first stage, we segment the invasive tumor over the user-indicated Region of Interest (ROI). Then, in the second stage, we classify the tumor tissue into four HER2 classes. For the classification stage, we use weakly supervised, constrained optimization to find a model that classifies cancerous patches such that the tumor surface percentage meets the guideline specification for each HER2 class. We end the second stage by freezing the model and refining its output logits in a supervised way against all slide labels in the training set. To ensure the quality of our dataset's labels, we conducted a multi-pathologist HER2 scoring consensus. For doubtful cases where no consensus was found, the HER2 class percentages output by our model can be interpreted to support the assessment. We achieve an F1-score of 0.78 on the test set while keeping our model interpretable for the pathologist, which we hope contributes to interpretable AI models in digital pathology.
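To make the link between patch-level classes and a slide-level score concrete, the following is a minimal sketch, not the method of the paper. It assumes equal-area tumor patches that have already been labeled with one of the four HER2 classes, and it reduces the ASCO/CAP criteria to a single "more than 10% of tumor" cutoff checked from the strongest class downward. The names `class_percentages`, `slide_score`, and `THRESHOLD` are hypothetical, and real scoring also depends on membrane staining completeness and on reflex testing for equivocal cases.

```python
from collections import Counter

# The four HER2 immunohistochemistry classes mentioned in the abstract.
HER2_CLASSES = ("0", "1+", "2+", "3+")

# Assumption: a simplified ">10% of tumor" cutoff applied to every class.
THRESHOLD = 0.10


def class_percentages(patch_labels):
    """Return the fraction of tumor patches assigned to each HER2 class.

    Assumes every patch covers the same tumor area, so patch counts
    stand in for tumor surface.
    """
    total = len(patch_labels)
    if total == 0:
        raise ValueError("at least one tumor patch is required")
    counts = Counter(patch_labels)
    return {cls: counts.get(cls, 0) / total for cls in HER2_CLASSES}


def slide_score(percentages, threshold=THRESHOLD):
    """Map class fractions to a slide-level score with a simplified cascade.

    The strongest class whose fraction exceeds the threshold wins;
    if none does, the slide is scored "0".
    """
    for cls in ("3+", "2+", "1+"):
        if percentages[cls] > threshold:
            return cls
    return "0"


# Illustrative input only: 100 made-up patch labels for one slide.
patches = ["3+"] * 5 + ["2+"] * 20 + ["1+"] * 30 + ["0"] * 45
fractions = class_percentages(patches)
print(fractions)
print(slide_score(fractions))
```

Exposing the per-class fractions alongside the final score is what lets a pathologist see why a slide sits close to a decision boundary, for example a slide with 9% of its tumor area in the strongest class.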