In the real world, autonomous driving agents navigate highly dynamic environments full of unexpected situations where pre-trained models are unreliable. In these situations, the only help immediately available to a vehicle is often a human operator. It is therefore critical to empower autonomous driving agents with the ability to navigate in a continuous and dynamic environment and to communicate with humans through sensorimotor-grounded dialogue. To this end, we introduce Dialogue On the ROad To Handle Irregular Events (DOROTHIE), a novel interactive simulation platform that enables the creation of unexpected situations on the fly to support empirical studies on situated communication with autonomous driving agents. Based on this platform, we created Situated Dialogue Navigation (SDN), a navigation benchmark of 183 trials with a total of 8415 utterances, around 18.7 hours of control streams, and 2.9 hours of trimmed audio. SDN is designed to evaluate an agent's ability to predict dialogue moves from humans as well as to generate its own dialogue moves and physical navigation actions. We further developed a transformer-based baseline model for these SDN tasks. Our empirical results indicate that language-guided navigation in a highly dynamic environment is an extremely difficult task for end-to-end models. These results offer insight for future work on robust autonomous driving agents. The DOROTHIE platform, SDN benchmark, and code for the baseline model are available at https://github.com/sled-group/DOROTHIE.