Object localization is a crucial task in computer vision. Existing methods typically localize objects in an image based on the features of the attended pixels. More recently, researchers have formulated object localization as a dynamic decision process that can be solved with reinforcement learning. In this project, we implement a novel active object localization algorithm based on deep reinforcement learning, treating localization as a Markov decision process (MDP). We compare two action settings for this MDP: a hierarchical method and a dynamic method. We also perform ablation studies that examine how different hyperparameters and architecture changes affect model performance.
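To make the decision-process view concrete, the following is a minimal sketch of one possible environment step. The abstract does not specify the state, actions, or reward, so everything here is an assumption made for illustration: the state is a bounding box, the hypothetical action set moves or rescales the box by a fraction `alpha` of its size, and the reward is the sign of the change in intersection-over-union (IoU) with a ground-truth box. It does not reproduce the hierarchical or dynamic action settings compared in this project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned box with corners (x1, y1) and (x2, y2)."""
    x1: float
    y1: float
    x2: float
    y2: float


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two boxes (0.0 if they do not overlap)."""
    iw = max(0.0, min(a.x2, b.x2) - max(a.x1, b.x1))
    ih = max(0.0, min(a.y2, b.y2) - max(a.y1, b.y1))
    inter = iw * ih
    union = (a.x2 - a.x1) * (a.y2 - a.y1) + (b.x2 - b.x1) * (b.y2 - b.y1) - inter
    return inter / union if union > 0 else 0.0


# Hypothetical action set, assumed for this sketch only.
ACTIONS = ("left", "right", "up", "down", "bigger", "smaller")


def step(box: Box, action: str, alpha: float = 0.2) -> Box:
    """Apply one action; the step size is a fraction alpha of the box size."""
    dx = alpha * (box.x2 - box.x1)
    dy = alpha * (box.y2 - box.y1)
    if action == "left":
        return Box(box.x1 - dx, box.y1, box.x2 - dx, box.y2)
    if action == "right":
        return Box(box.x1 + dx, box.y1, box.x2 + dx, box.y2)
    if action == "up":
        return Box(box.x1, box.y1 - dy, box.x2, box.y2 - dy)
    if action == "down":
        return Box(box.x1, box.y1 + dy, box.x2, box.y2 + dy)
    if action == "bigger":
        return Box(box.x1 - dx / 2, box.y1 - dy / 2, box.x2 + dx / 2, box.y2 + dy / 2)
    if action == "smaller":
        return Box(box.x1 + dx / 2, box.y1 + dy / 2, box.x2 - dx / 2, box.y2 - dy / 2)
    raise ValueError(f"unknown action: {action}")


def reward(old: Box, new: Box, target: Box) -> float:
    """Assumed reward: +1 if the action increased IoU with the target, else -1."""
    return 1.0 if iou(new, target) > iou(old, target) else -1.0
```

In a full agent, a deep network would map image features of the current box to action values. Here the transition and reward are kept separate so that either could be swapped out without touching the other.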