Computer vision and multimedia information processing have made remarkable progress over the last decade, and many tasks can now be performed with accuracy equal to or better than that of humans. This is because we can draw on huge amounts of training data and enormous computing power, and because machine learning has evolved into a suite of techniques for processing data and delivering accurate vision-based systems. What kinds of applications do we use this processing for? We use it in autonomous vehicle navigation, in security applications such as searching CCTV footage, and in medical image analysis for healthcare diagnostics. One application that is not widespread is image or video search carried out directly by users. In this paper we make the case for such image finding or re-finding by examining human memory and when it fails. This motivates a different approach to image search, which we outline along with the requirements it places on computer vision.