Recent advances in Deep Learning and Computer Vision have alleviated many of these bottlenecks, allowing algorithms to learn without labels while achieving better performance. Specifically, Transformers provide a global perspective of the image, which Convolutional Neural Networks (CNNs) lack by design. Here we present Cross Architectural Self-Supervision (CASS), a novel self-supervised learning approach that leverages Transformers and CNNs simultaneously, while also being computationally accessible to general practitioners via easily available cloud services. Compared to existing state-of-the-art self-supervised learning approaches, we empirically show that CASS-trained CNNs and Transformers gain an average of 8.5% with 100% labelled data, 7.3% with 10% labelled data, and 11.5% with 1% labelled data, across three diverse datasets. Notably, one of the employed datasets comprises histopathology slides of an autoimmune disease, a topic underrepresented in Medical Imaging and for which minimal data are available. In addition, our findings reveal that CASS is twice as efficient as other state-of-the-art methods in terms of training time. The code is open source and available on GitHub.
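To make the cross-architecture idea concrete, the following is a minimal sketch of one plausible training signal: embeddings of the same image from a CNN branch and a Transformer branch are pulled together via a cosine-similarity alignment loss. The function name and the exact loss are illustrative assumptions, not the paper's formulation.

```python
import numpy as np

def cosine_alignment_loss(cnn_emb: np.ndarray, vit_emb: np.ndarray) -> float:
    """Hypothetical cross-architecture objective: 1 minus the mean cosine
    similarity between CNN and Transformer embeddings of the same batch.
    This is only a sketch; CASS's actual loss may differ."""
    a = cnn_emb / np.linalg.norm(cnn_emb, axis=1, keepdims=True)
    b = vit_emb / np.linalg.norm(vit_emb, axis=1, keepdims=True)
    return 1.0 - float(np.mean(np.sum(a * b, axis=1)))

# Toy check: identical embeddings from both branches give (near-)zero loss.
x = np.random.default_rng(0).normal(size=(4, 16))
print(round(cosine_alignment_loss(x, x), 6))
```

In a real training loop, both backbones would process the same (label-free) batch and this loss would update both, so each architecture distils its inductive biases into the other.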