Safe manipulation in unstructured environments is a challenging problem for service robots. A failure detection system is needed to monitor manipulation and detect unintended outcomes. We propose FINO-Net, a novel deep neural network based on multimodal sensor fusion, to detect and identify manipulation failures. We also introduce a multimodal dataset containing 229 real-world manipulation samples recorded with a Baxter robot. Our network combines RGB, depth, and audio readings to detect and classify failures. Results indicate that fusing RGB with the depth and audio modalities significantly improves performance. FINO-Net achieves 98.60% detection accuracy and 87.31% classification accuracy on our novel dataset. Code and data are publicly available at https://github.com/ardai/fino-net.