Software architectures for conversational robots typically consist of multiple modules, each designed for a particular processing task or functionality. Some of these modules are developed to decide which action the robot ought to perform next in the current context. Those actions may relate to physical movements, such as driving forward or grasping an object, but may also correspond to communicative acts, such as asking the human user a question. In this position paper, we reflect on the organization of those decision modules in human-robot interaction platforms. We discuss the relative benefits and limitations of modular vs. end-to-end architectures, and argue that, despite the increasing popularity of end-to-end approaches, modular architectures remain preferable when developing conversational robots designed to execute complex tasks in collaboration with human users. We also show that most practical HRI architectures tend to be either robot-centric or dialogue-centric, depending on where developers wish to place the ``command center'' of their system. While those design choices may be justified in some application domains, they also limit the robot's ability to flexibly interleave physical movements and conversational behaviours. We contend that architectures placing ``action managers'' and ``interaction managers'' on an equal footing may provide the best path forward for future human-robot interaction systems.