There are many computer vision applications, including object segmentation, classification, object detection, and reconstruction, for which machine learning (ML) achieves state-of-the-art performance. Nowadays, we can build ML tools for such applications with real-world accuracy. However, each tool works well only within the domain in which it has been trained and developed. When we train a model on a dataset from one specific domain and test it on another, unseen domain, known as an out-of-distribution (OOD) dataset, the model shows a decrease in performance. For instance, when we train a simple classifier on real-world images and apply that model to the same classes in a different domain, such as cartoons, paintings, or sketches, its performance disappoints. This presents serious challenges of domain generalisation (DG), domain adaptation (DA), and domain shift. To enhance the power of ML tools, we can rebuild and retrain models from scratch, or we can perform transfer learning. In this paper, we present a comparative study of vision-based technologies for domain-specific and domain-generalised methods. We highlight that simple convolutional neural network (CNN) based deep learning methods perform poorly when they have to tackle domain shift. Experiments are conducted on two popular vision benchmarks, PACS and Office-Home. We introduce an implementation pipeline for domain generalisation methods and conventional deep learning models. The outcome confirms that CNN-based deep learning models show poor generalisation compared to other, more elaborate methods.
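The train-on-source, test-on-unseen-domain setup described above is commonly evaluated with a leave-one-domain-out protocol on benchmarks such as PACS. A minimal sketch of that split logic is given below; the function and variable names are illustrative, not taken from the paper's pipeline.

```python
# Hedged sketch: leave-one-domain-out split, as commonly used for
# domain generalisation experiments on PACS.
# PACS contains four domains sharing the same seven object classes.
PACS_DOMAINS = ["photo", "art_painting", "cartoon", "sketch"]


def leave_one_domain_out(domains, held_out):
    """Return (source_domains, target_domain) for one DG run:
    the model is trained on all source domains and evaluated on
    the single held-out (out-of-distribution) target domain."""
    if held_out not in domains:
        raise ValueError(f"unknown domain: {held_out}")
    sources = [d for d in domains if d != held_out]
    return sources, held_out


# One run per domain: each domain takes a turn as the OOD test set.
splits = [leave_one_domain_out(PACS_DOMAINS, d) for d in PACS_DOMAINS]
```

Averaging accuracy over all four held-out runs is the usual way a single generalisation score is reported for a method on PACS.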