With the increasing capabilities of modern vehicles, novel interaction approaches have emerged that go beyond traditional touch-based and voice-command interfaces. Consequently, hand gestures, head pose, eye gaze, and speech have been extensively investigated in automotive applications for object selection and referencing. Despite these significant advances, existing approaches mostly employ a one-model-fits-all strategy that is unsuitable for varying user behavior and individual differences. Moreover, current referencing approaches either consider these modalities separately or focus on a stationary setting, whereas the situation in a moving vehicle is highly dynamic and subject to safety-critical constraints. In this paper, I propose a research plan for a user-centered adaptive multimodal fusion approach for referencing external objects from a moving vehicle. The proposed plan aims to provide an open-source framework for user-centered adaptation and personalization that combines user observations and heuristics, multimodal fusion, clustering, transfer learning for model adaptation, and continuous learning, moving towards trusted human-centered artificial intelligence.
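To make the fusion-and-adaptation idea concrete, the following is a minimal, hypothetical sketch (not the paper's implementation) of weighted late fusion over candidate external objects, where the per-user modality weights are the quantity one could adapt via clustering, transfer learning, or continuous learning as outlined above. All function and variable names (e.g., `fuse_modalities`, `user_weights`) are illustrative assumptions.

```python
# Hypothetical sketch: user-adaptive late fusion of referencing modalities.
# Each modality yields a probability distribution over candidate objects
# outside the vehicle; per-user weights reflect how reliable each modality
# is for this particular driver and could be updated online.
import numpy as np

def fuse_modalities(modality_probs: dict, user_weights: dict) -> np.ndarray:
    """Weighted late fusion of per-modality probabilities over candidates."""
    fused = np.zeros_like(next(iter(modality_probs.values())))
    for name, probs in modality_probs.items():
        fused += user_weights.get(name, 1.0) * probs
    return fused / fused.sum()  # renormalize to a proper distribution

# Example: three candidate objects; this driver's gaze is assumed more
# reliable than their pointing gesture, so it receives a higher weight.
probs = {
    "gaze":    np.array([0.6, 0.3, 0.1]),
    "gesture": np.array([0.4, 0.4, 0.2]),
    "speech":  np.array([0.5, 0.3, 0.2]),
}
weights = {"gaze": 0.5, "gesture": 0.2, "speech": 0.3}
print(fuse_modalities(probs, weights))  # argmax picks the referenced object
```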