Multiview camera setups have proven useful in many computer vision applications for reducing ambiguities, mitigating occlusions, and increasing field-of-view coverage. However, the high computational cost of processing multiple views poses a significant challenge for end devices with limited computational resources. To address this issue, we propose a view selection approach that analyzes the target object or scenario from the given views and selects the next best view for processing. Our approach features a reinforcement-learning-based camera selection module, MVSelect, that not only selects views but also facilitates joint training with the task network. Experimental results on multiview classification and detection tasks show that our approach achieves promising performance while using only 2 or 3 out of N available views, significantly reducing computational costs. Furthermore, analysis of the selected views reveals that certain cameras can be shut off with minimal performance impact, shedding light on future camera layout optimization for multiview systems. Code is available at https://github.com/hou-yz/MVSelect.
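To make the idea of sequential view selection concrete, the following is a minimal sketch of a selection loop under a fixed view budget. It is not the MVSelect implementation: the function names `select_views` and `toy_q_values`, the epsilon-greedy exploration step, and the hand-written scoring rule are all assumptions introduced for illustration. In a learned system, the scoring function would presumably be a policy network conditioned on features of the views processed so far.

```python
import random
from typing import Callable, List, Optional, Sequence


def select_views(
    num_views: int,
    initial_view: int,
    budget: int,
    q_values: Callable[[Sequence[int]], List[float]],
    epsilon: float = 0.0,
    rng: Optional[random.Random] = None,
) -> List[int]:
    """Pick `budget` distinct views one at a time, starting from `initial_view`.

    `q_values` is a hypothetical stand-in for a learned policy: given the
    views selected so far, it returns one score per camera.
    """
    if not 0 <= initial_view < num_views:
        raise ValueError("initial_view out of range")
    if not 1 <= budget <= num_views:
        raise ValueError("budget must be between 1 and num_views")
    rng = rng or random.Random()
    selected = [initial_view]
    while len(selected) < budget:
        # Views that were already processed are never chosen again.
        remaining = [v for v in range(num_views) if v not in selected]
        if rng.random() < epsilon:
            next_view = rng.choice(remaining)  # explore (assumed training-time behavior)
        else:
            scores = q_values(selected)
            next_view = max(remaining, key=lambda v: scores[v])  # exploit
        selected.append(next_view)
    return selected


def toy_q_values(selected: Sequence[int], num_views: int = 6) -> List[float]:
    """Toy scoring rule (an assumption, not a learned model).

    Cameras are imagined on a ring; a camera scores higher the farther it
    is from every camera that has already been selected.
    """
    def ring_dist(a: int, b: int) -> int:
        d = abs(a - b)
        return min(d, num_views - d)

    return [float(min(ring_dist(v, s) for s in selected)) for v in range(num_views)]


if __name__ == "__main__":
    # Six cameras, start from camera 0, process three views in total.
    print(select_views(num_views=6, initial_view=0, budget=3, q_values=toy_q_values))
```

With `epsilon=0.0` the loop is purely greedy with respect to the supplied scores, so only `budget` of the `num_views` cameras are ever passed to the downstream task network; the remaining views are skipped, which is where the computational saving would come from.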