Misalignment between model predictions and intended usage can be detrimental to the deployment of computer vision models. The issue is exacerbated when the task involves complex structured outputs, as it becomes harder to design procedures that address this misalignment. In natural language processing, this is often addressed using reinforcement learning techniques that align models with a task reward. We adopt this approach and show its surprising effectiveness across multiple computer vision tasks, such as object detection, panoptic segmentation, colorization, and image captioning. We believe this approach has the potential to be widely useful for better aligning models with a diverse range of computer vision tasks.
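To make the reward-alignment idea concrete, below is a minimal REINFORCE-style sketch, not the paper's actual models or tasks. It assumes a toy "policy" (a categorical distribution over K discrete outputs) standing in for a pretrained vision model, and a hypothetical non-differentiable `reward` function standing in for a task metric such as a detection or captioning score. The policy-gradient update increases the log-probability of sampled outputs in proportion to their reward, with an expected-reward baseline for variance reduction.

```python
import numpy as np

rng = np.random.default_rng(0)
K = 5
logits = np.zeros(K)  # stand-in for a pretrained model's output head

def reward(y):
    # Hypothetical non-differentiable task reward: output 3 is "correct".
    return 1.0 if y == 3 else 0.0

lr = 0.5
for step in range(300):
    # Softmax over logits gives the sampling distribution.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()

    y = rng.choice(K, p=probs)  # sample a prediction from the model

    # Expected reward under the current policy, used as a baseline.
    baseline = probs @ np.array([reward(k) for k in range(K)])

    # Gradient of log p(y) with respect to the logits: e_y - probs.
    grad_logp = -probs
    grad_logp[y] += 1.0

    # REINFORCE update: scale the score function by (reward - baseline).
    logits += lr * (reward(y) - baseline) * grad_logp

best = int(np.argmax(logits))  # the tuned model's greedy prediction
```

After tuning, the greedy prediction matches the rewarded output; in the paper's setting, the same principle is applied to structured outputs (boxes, masks, captions) scored by task-specific rewards.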