Online continual learning from data streams in dynamic environments is a critical direction in computer vision. However, realistic benchmarks and fundamental studies in this line are still missing. To bridge the gap, we present a new online continual object detection benchmark with an egocentric video dataset, Objects Around Krishna (OAK). OAK adopts the KrishnaCAM videos, an egocentric video stream collected over nine months by a graduate student. OAK provides exhaustive bounding box annotations for 80 video snippets (~17.5 hours) covering 105 object categories in outdoor scenes. The emergence of new object categories in our benchmark follows a pattern similar to what a single person might see in their day-to-day life. The dataset also captures the natural distribution shifts that occur as the person travels to different places. These egocentric long-running videos provide a realistic playground for continual learning algorithms, especially in online embodied settings. We also introduce new evaluation metrics that measure model performance and catastrophic forgetting, and provide baseline studies for online continual object detection. We believe this benchmark will pose new, exciting challenges for learning from non-stationary data in continual learning. The OAK dataset and the associated benchmark are released at https://oakdata.github.io/.