Known-item video search is effective with a human in the loop to interactively investigate the search results and refine the initial query. Nevertheless, when the first few pages of results are swamped with visually similar items, or the search target is buried deep in the ranked list, finding the known-item target usually requires a long duration of browsing and result inspection. This paper tackles the problem with reinforcement learning, aiming to reach a search target within a few rounds of interaction by long-term learning from user feedback. Specifically, the system interactively plans a navigation path based on feedback and recommends a potential target that maximizes the long-term reward for the user to comment on. We conduct experiments on the challenging task of video corpus moment retrieval (VCMR), which localizes moments in a large video corpus. The experimental results on the TVR and DiDeMo datasets verify that our proposed work is effective in retrieving moments that are hidden deep inside the ranked lists of CONQUER and HERO, the state-of-the-art automatic search engines for VCMR.