The ability to develop a high-level understanding of a scene, such as perceiving danger levels, can prove valuable in planning multi-robot search and rescue (SaR) missions. In this work, we propose to uniquely leverage natural language descriptions from the mission commander and image data captured by robots to estimate scene danger. Given a description and an image, a state-of-the-art deep neural network is used to compute a corresponding similarity score, which is then converted into a probability distribution over danger levels. Because commonly used visio-linguistic datasets do not represent SaR missions well, we collect a large-scale image-description dataset from synthetic images of realistic disaster scenes and use it to train our machine learning model. A risk-aware variant of the Multi-robot Efficient Search Path Planning (MESPP) problem is then formulated to use the danger estimates in order to account for high-risk locations in the environment when planning the searchers' paths. The problem is solved via a distributed approach based on Mixed-Integer Linear Programming. Our experiments demonstrate that our framework enables planning safer yet highly successful search missions, satisfying the two most important requirements of SaR missions: ensuring the safety of both searchers and victims.
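As a minimal illustration of the estimation step, the sketch below converts per-level image-description similarity scores into a probability distribution over danger levels using a softmax. This is a hypothetical simplification for intuition only; the function name, the softmax mapping, and the example scores are assumptions, not the paper's exact method.

```python
import numpy as np

def danger_distribution(similarities):
    """Map similarity scores (one per danger level) to a probability
    distribution over danger levels via a softmax.

    Hypothetical illustration: the paper's actual score-to-distribution
    conversion may differ.
    """
    s = np.asarray(similarities, dtype=float)
    e = np.exp(s - s.max())  # subtract the max for numerical stability
    return e / e.sum()

# Example: similarity of one image to descriptions of three danger levels
probs = danger_distribution([0.2, 0.5, 0.9])
# probs sums to 1; the highest-similarity level gets the largest mass
```

A planner could then treat `probs` as a per-location risk estimate when weighting candidate search paths.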