Positive-unlabeled learning for binary and multi-class cell detection in histopathology images with incomplete annotations

From arXiv. Accepted for publication at the Journal of Machine Learning for Biomedical Imaging (MELBA): https://melba-journal.org/2022:027. arXiv admin note: text overlap with arXiv:2106.15918.

Cell detection in histopathology images is of great interest to clinical practice and research, and convolutional neural networks (CNNs) have achieved remarkable cell detection results. Typically, to train CNN-based cell detection models, every positive instance in the training images needs to be annotated, and instances that are not labeled as positive are considered negative samples. However, manual cell annotation is complicated due to the large number and diversity of cells, and it can be difficult to ensure the annotation of every positive instance. In many cases, only incomplete annotations are available, where some of the positive instances are annotated and the others are not, and the classification loss term for negative samples in typical network training becomes incorrect. In this work, to address this problem of incomplete annotations, we propose to reformulate the training of the detection network as a positive-unlabeled learning problem. Since the instances in unannotated regions can be either positive or negative, they have unknown labels. Using the samples with unknown labels and the positively labeled samples, we first derive an approximation of the classification loss term corresponding to negative samples for binary cell detection, and based on this approximation we further extend the proposed framework to multi-class cell detection. For evaluation, experiments were performed on four publicly available datasets. The experimental results show that our method improves the performance of cell detection in histopathology images given incomplete annotations for network training.
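The abstract's central idea, approximating the negative-sample loss term from positively labeled and unlabeled samples, can be illustrated with the standard positive-unlabeled risk identity. The sketch below is not the paper's exact formulation. It assumes that unlabeled samples follow the overall data distribution and that the fraction of positives (`prior`) is known or estimated separately. The function names and the zero clipping are illustrative choices.

```python
import math


def bce(prob, label):
    """Binary cross-entropy for one predicted probability and a 0/1 label."""
    eps = 1e-12
    prob = min(max(prob, eps), 1.0 - eps)  # avoid log(0)
    return -math.log(prob) if label == 1 else -math.log(1.0 - prob)


def mean(values):
    return sum(values) / len(values)


def pu_classification_loss(pos_probs, unl_probs, prior):
    """Classification loss computed from positive and unlabeled samples only.

    pos_probs: predicted positive-class probabilities for annotated positives
    unl_probs: predicted positive-class probabilities for unlabeled samples
    prior:     assumed fraction of positives among all samples (hypothetical input)
    """
    # Positive term: annotated positives should be predicted as positive.
    pos_term = prior * mean([bce(p, 1) for p in pos_probs])

    # Negative term: since p(x) = prior * p_pos(x) + (1 - prior) * p_neg(x),
    # (1 - prior) * E_neg[loss(f, 0)] = E_unl[loss(f, 0)] - prior * E_pos[loss(f, 0)].
    neg_term = (mean([bce(p, 0) for p in unl_probs])
                - prior * mean([bce(p, 0) for p in pos_probs]))

    # Clip at zero so a finite-sample estimate cannot turn the risk negative.
    return pos_term + max(0.0, neg_term)
```

With `prior = 0` the expression reduces to the conventional loss that treats every unlabeled sample as negative, which is the term the abstract describes as incorrect under incomplete annotations.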
