Camera scene detection is among the most popular computer vision problems on smartphones. While phone vendors have developed many custom solutions for this task, none of the resulting models has been publicly available until now. To address this problem, we introduce the first Mobile AI challenge, where the target is to develop quantized deep learning-based camera scene classification solutions that can demonstrate real-time performance on smartphones and IoT platforms. For this, the participants were provided with the large-scale CamSDD dataset, consisting of more than 11K images belonging to the 30 most important scene categories. The runtime of all models was evaluated on the popular Apple Bionic A11 platform, which can be found in many iOS devices. The proposed solutions are fully compatible with all major mobile AI accelerators and can demonstrate more than 100-200 FPS on the majority of recent smartphone platforms while achieving a top-3 accuracy of more than 98%. A detailed description of all models developed in the challenge is provided in this paper.