Numerous applications require robots to operate in environments shared with other agents such as humans or other robots. However, such shared scenes are typically subject to different kinds of long-term semantic scene changes. The ability to model and predict such changes is thus crucial for robot autonomy. In this work, we formalize the task of semantic scene variability estimation and identify three main varieties of semantic scene change: changes in the position of an object, in its semantic state, or in the composition of the scene as a whole. To represent this variability, we propose the Variable Scene Graph (VSG), which augments existing 3D Scene Graph (SG) representations with a variability attribute representing the likelihood of discrete long-term change events. We present a novel method, DeltaVSG, to estimate the variability of VSGs in a supervised fashion. We evaluate our method on the 3RScan long-term dataset, showing notable improvements in this novel task over existing approaches. DeltaVSG achieves a precision of 72.2% and a recall of 66.8%, often mimicking human intuition about how indoor scenes change over time. We further show the utility of VSG predictions in the task of active robotic change detection, speeding up task completion by 62.4% compared to a scene-change-unaware planner. We make our code available as open-source.
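To make the representation concrete, the following is a minimal sketch (not the authors' implementation) of what a VSG node augmented with a variability attribute might look like; all class and field names here are hypothetical, with one variability score per change type named in the abstract.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical sketch of a Variable Scene Graph (VSG) node: a standard
# 3D scene-graph node augmented with a "variability" attribute giving
# the estimated likelihood of each discrete long-term change event.
@dataclass
class VSGNode:
    node_id: int
    semantic_label: str                  # e.g. "chair"
    position: List[float]                # 3D centroid [x, y, z]
    state: str = "default"               # semantic state, e.g. "open"/"closed"
    # One entry per change variety identified in the paper:
    # position change, semantic state change, scene composition change.
    variability: Dict[str, float] = field(default_factory=lambda: {
        "position": 0.0,      # object is moved within the scene
        "state": 0.0,         # object's semantic state changes
        "composition": 0.0,   # object is added to / removed from the scene
    })

# Example: a chair that is likely to be moved but unlikely to disappear.
chair = VSGNode(
    node_id=7,
    semantic_label="chair",
    position=[1.2, 0.4, 0.0],
    variability={"position": 0.8, "state": 0.05, "composition": 0.1},
)
print(chair.variability["position"])  # 0.8
```

A method like DeltaVSG would then predict these per-node variability scores in a supervised fashion, and a change-aware planner could prioritize revisiting nodes with high scores during active change detection.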