Understanding the interaction between different road users is critical for road safety and for automated vehicles (AVs). Existing mathematical models of this interaction are mostly based on either cognitive or machine learning (ML) approaches. However, current cognitive models cannot simulate road user trajectories in general scenarios, while ML models do not focus on the mechanisms that generate behavior and take a high-level perspective, which can cause them to miss important human-like behaviors. Here, we develop a model of human pedestrian crossing decisions based on computational rationality, an approach that uses deep reinforcement learning (RL) to learn boundedly optimal behavior policies under human constraints, in our case a model of the limited human visual system. We show that the proposed combined cognitive-RL model captures human-like patterns of gap acceptance and crossing initiation time. Interestingly, our model's decisions are sensitive not only to the time gap but also to the speed of the approaching vehicle, something that has been described as a "bias" in human gap acceptance behavior. Our results suggest that it is instead a rational adaptation to human perceptual limitations. Moreover, we demonstrate an approach to accounting for individual differences in computational rationality models by conditioning the RL policy on the parameters of the human constraints. Our results demonstrate the feasibility of generating more human-like road user behavior by combining RL with cognitive models.
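To illustrate why a speed dependence can follow from perceptual limits rather than from a bias, here is a minimal sketch under assumptions that are ours, not the abstract's. It supposes that the pedestrian infers the vehicle's distance from a noisy viewing angle below the horizon, and that the vehicle's speed is known exactly. The function names, the eye height of 1.6 m, and the angular noise of 0.01 rad are all hypothetical illustration values. Under these assumptions, two vehicles with the same true time gap produce time-gap estimates of different reliability, so a policy acting on the noisy percept has reason to treat them differently.

```python
import math


def distance_noise_std(distance_m, eye_height_m=1.6, angle_noise_rad=0.01):
    """First-order std of perceived distance when the viewing angle is noisy.

    Assumed geometry: D = h / tan(theta), so |dD/dtheta| = (D^2 + h^2) / h.
    """
    return angle_noise_rad * (distance_m ** 2 + eye_height_m ** 2) / eye_height_m


def time_gap_noise_std(time_gap_s, speed_mps, eye_height_m=1.6, angle_noise_rad=0.01):
    """Std of the perceived time gap, assuming speed is known without error."""
    distance_m = time_gap_s * speed_mps  # true distance implied by the time gap
    sigma_d = distance_noise_std(distance_m, eye_height_m, angle_noise_rad)
    return sigma_d / speed_mps


if __name__ == "__main__":
    # Same 3 s time gap, two hypothetical approach speeds (m/s).
    for speed in (7.0, 14.0):
        sigma = time_gap_noise_std(3.0, speed)
        print(f"speed={speed:4.1f} m/s  time-gap noise std={sigma:.3f} s")
```

The sketch only shows that perceptual uncertainty about the time gap varies with speed under the assumed noise model. It says nothing about the direction of the resulting behavioral effect, which in the work described above emerges from the learned RL policy.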