A novel approach of a deep reinforcement learning based motion cueing algorithm for vehicle driving simulation

In the field of motion simulation, the level of immersion strongly depends on the motion cueing algorithm (MCA), which maps the reference motion of the simulated vehicle to a motion of the motion simulation platform (MSP). The challenge for the MCA is to reproduce the motion perception of a real vehicle driver as accurately as possible without exceeding the workspace limits of the MSP, in order to provide a realistic virtual driving experience. A large discrepancy between the perceived motion signals and the optical cues can cause motion sickness, with the typical symptoms of nausea, dizziness, headache, and fatigue. Existing approaches either produce non-optimal results, e.g., due to filtering, linearization, or simplifications, or require computation times that exceed the real-time requirements of a closed-loop application. This work presents a new solution in which the principles of the MCA are not specified by a human designer; instead, an artificial intelligence (AI) learns the optimal motion by trial and error through interaction with the MSP. To achieve this, deep reinforcement learning (RL) is applied, where an agent interacts with an environment formulated as a Markov decision process (MDP). This allows the agent to directly control a simulated MSP and to obtain feedback on its performance in terms of platform workspace usage and the motion acting on the simulator user. The RL algorithm used is proximal policy optimization (PPO), in which the value function and the policy corresponding to the control strategy are learned, and both are represented by artificial neural networks (ANNs). The approach is implemented in Python, and its functionality is demonstrated on the practical example of pre-recorded lateral maneuvers. The subsequent validation on a standardized double lane change shows that the RL algorithm is able to learn the control strategy and improve the quality of...
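To make the MDP formulation described above more concrete, the following is a minimal sketch of a toy one-degree-of-freedom lateral platform environment. It is not the authors' implementation: the class name `LateralMSPEnv`, the helper `rollout`, the state layout, the workspace and acceleration limits, the reward weights, and the out-of-workspace penalty are all hypothetical choices made for illustration. The reward combines the two feedback terms named in the abstract, namely the motion acting on the user (here, the squared error to the reference acceleration) and the platform workspace usage.

```python
import numpy as np


class LateralMSPEnv:
    """Toy 1-DOF lateral motion platform posed as an MDP (illustrative only)."""

    def __init__(self, reference_acc, dt=0.01, pos_limit=0.5, acc_limit=5.0,
                 w_acc=1.0, w_pos=0.1, limit_penalty=10.0):
        # All default values are assumptions, not taken from the paper.
        self.reference_acc = np.asarray(reference_acc, dtype=float)
        self.dt = dt                        # time step in s
        self.pos_limit = pos_limit          # workspace half-width in m
        self.acc_limit = acc_limit          # actuator acceleration limit in m/s^2
        self.w_acc = w_acc                  # weight of the motion error term
        self.w_pos = w_pos                  # weight of the workspace usage term
        self.limit_penalty = limit_penalty  # extra cost for leaving the workspace
        self.reset()

    def reset(self):
        self.t = 0
        self.pos = 0.0
        self.vel = 0.0
        return self._observation()

    def _observation(self):
        # State: platform position, platform velocity, current reference acceleration.
        idx = min(self.t, len(self.reference_acc) - 1)
        return np.array([self.pos, self.vel, self.reference_acc[idx]])

    def step(self, action):
        # Action: commanded platform acceleration, clipped to the actuator limit.
        acc = float(np.clip(action, -self.acc_limit, self.acc_limit))
        ref = self.reference_acc[self.t]

        # Integrate the platform dynamics (semi-implicit Euler).
        self.vel += acc * self.dt
        self.pos += self.vel * self.dt

        # Reward: penalize motion error and workspace usage.
        acc_error = (acc - ref) ** 2
        workspace_use = (self.pos / self.pos_limit) ** 2
        reward = -(self.w_acc * acc_error + self.w_pos * workspace_use)

        out_of_bounds = abs(self.pos) > self.pos_limit
        if out_of_bounds:
            reward -= self.limit_penalty

        self.t += 1
        done = out_of_bounds or self.t >= len(self.reference_acc)
        return self._observation(), reward, done


def rollout(env, policy):
    """Run one episode with a policy and return the accumulated reward."""
    obs = env.reset()
    total, done = 0.0, False
    while not done:
        obs, reward, done = env.step(policy(obs))
        total += reward
    return total


# Example: a sinusoidal lateral reference acceleration over 4 s.
time = np.arange(0.0, 4.0, 0.01)
reference = 2.0 * np.sin(np.pi * time)
env = LateralMSPEnv(reference)

# Placeholder policy: a scaled copy of the reference acceleration.
print(rollout(env, lambda obs: 0.2 * obs[2]))
```

In an RL setup such as the one the abstract describes, the placeholder policy would be replaced by a neural network policy trained with PPO against this kind of environment. The sketch also shows the trade-off the agent must learn: reproducing the reference acceleration at full scale quickly drives the platform out of its workspace, while not moving at all maximizes the motion error.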
