Surgical data science is a new research field that aims to observe all aspects and factors of the patient treatment process in order to provide the right assistance to the right person at the right time. Due to the breakthrough successes of deep learning-based solutions for automatic image annotation, the availability of reference annotations for algorithm training is becoming a major bottleneck in the field. The purpose of this paper was to investigate the concept of self-supervised learning to address this issue. Our approach is guided by the hypothesis that unlabeled video data can be used to learn a representation of the target domain that boosts the performance of state-of-the-art machine learning algorithms when used for pre-training. Essentially, the method uses an auxiliary task, trained on unlabeled endoscopic video data from the target domain, to initialize a convolutional neural network (CNN) for the target task. In this paper, we propose the re-colorization of medical images with a generative adversarial network (GAN)-based architecture as the auxiliary task. A variant of the method involves a second pre-training step based on labeled data for the target task from a related domain. We have validated both variants using medical instrument segmentation as the target task. The proposed approach can be used to radically reduce the manual annotation effort involved in training CNNs. Compared to the baseline approach of generating annotated data from scratch, our method reduced the number of labeled images required by up to 60% in exploratory experiments without sacrificing performance. Our method also outperforms alternative methods for CNN pre-training, such as pre-training on publicly available non-medical data (COCO) or medical data (MICCAI endoscopic vision challenge 2017) using the target task (in this instance: segmentation).
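As a rough illustration of why the re-colorization auxiliary task needs no manual labels, the sketch below builds training pairs from unlabeled frames: the input is a decolorized copy of a frame and the target is the original frame. This is a minimal sketch under stated assumptions, not the implementation used in the paper. The function names, the toy frames, and the BT.601 luma weights used for decolorization are all assumptions introduced for illustration, and the GAN that would be trained on these pairs is omitted.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each in 0..255
Frame = List[List[Pixel]]      # rows of pixels
GrayFrame = List[List[int]]    # rows of intensities in 0..255


def to_grayscale(frame: Frame) -> GrayFrame:
    """Decolorize a frame using BT.601 luma weights (an assumed choice)."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in frame
    ]


def make_recolorization_pairs(frames: List[Frame]) -> List[Tuple[GrayFrame, Frame]]:
    """Turn unlabeled frames into (grayscale input, color target) pairs.

    The color target is the original frame itself, so the supervision
    signal comes for free from the unlabeled video.
    """
    return [(to_grayscale(frame), frame) for frame in frames]


# Hypothetical toy data: two unlabeled 1x2 "frames".
unlabeled_frames: List[Frame] = [
    [[(255, 0, 0), (0, 255, 0)]],
    [[(0, 0, 255), (255, 255, 255)]],
]

pairs = make_recolorization_pairs(unlabeled_frames)
```

In a full pipeline, a network pre-trained to map each grayscale input back to its color target would then provide the initial weights for the segmentation CNN, which is fine-tuned on the smaller labeled set.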