Recent advances in deep learning and computer vision have alleviated many of these bottlenecks, allowing algorithms to be label-free while improving performance. In particular, Transformers provide a global view of the image, which Convolutional Neural Networks (CNNs) lack by design. Here we present \textbf{C}ross \textbf{A}rchitectural \textbf{S}elf-\textbf{S}upervision (CASS), a novel self-supervised learning approach that leverages Transformers and CNNs simultaneously, while remaining computationally accessible to general practitioners via readily available cloud services. Compared to existing state-of-the-art self-supervised learning approaches, we empirically show that CASS-trained CNNs and Transformers gain an average of 8.5\% with 100\% labelled data, 7.3\% with 10\% labelled data, and 11.5\% with 1\% labelled data, across three diverse datasets. Notably, one of the employed datasets includes histopathology slides of an autoimmune disease, a topic underrepresented in medical imaging and for which minimal data are available. In addition, our findings reveal that CASS is twice as efficient as other state-of-the-art methods in terms of training time.