Road accidents involving autonomous vehicles commonly occur when an obstacle, typically a pedestrian, appears suddenly in the path of the moving vehicle, leaving the robot very little time to react to the change in scene. To tackle this issue, we propose a novel algorithmic implementation that procedurally classifies the intent of a single, arbitrarily chosen pedestrian in a two-dimensional frame into logic states, using quaternions generated from a MediaPipe pose estimation model. This bypasses the need to employ relatively high-latency deep-learning algorithms, primarily because depth perception is not required and because most IoT edge devices impose an implicit cap on computational resources. The model achieved an average testing accuracy of 83.56% with a variance of 0.0042 while operating at an average latency of 48 milliseconds, demonstrating multiple notable advantages over the current standard of using spatio-temporal convolutional networks for these perceptive tasks.
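As a rough illustration of how quaternions can be derived from two-dimensional pose keypoints, the following minimal sketch computes the unit quaternion that rotates one limb segment onto another and applies a simple threshold to it. It is not the classification procedure evaluated here: the function names, the choice of the hip, knee and ankle keypoints, the `"bent"`/`"straight"` labels and the 20-degree threshold are all hypothetical, and the keypoints are assumed to arrive as plain `(x, y)` tuples in normalized image coordinates.

```python
import math


def _normalize(v):
    """Return v scaled to unit length; reject zero-length vectors."""
    n = math.sqrt(sum(c * c for c in v))
    if n == 0.0:
        raise ValueError("zero-length vector has no direction")
    return tuple(c / n for c in v)


def _cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def rotation_quaternion(u, v):
    """Unit quaternion (w, x, y, z) rotating direction u onto direction v."""
    u, v = _normalize(u), _normalize(v)
    w = 1.0 + sum(a * b for a, b in zip(u, v))
    if w < 1e-9:
        # Antiparallel vectors: rotate by 180 degrees about any axis
        # perpendicular to u.
        helper = (1.0, 0.0, 0.0) if abs(u[0]) < 0.9 else (0.0, 1.0, 0.0)
        return (0.0,) + _normalize(_cross(u, helper))
    return _normalize((w,) + _cross(u, v))


def rotation_angle_deg(q):
    """Rotation angle encoded by a unit quaternion, in degrees."""
    w, x, y, z = q
    return math.degrees(2.0 * math.atan2(math.sqrt(x * x + y * y + z * z), w))


def knee_state(hip, knee, ankle, threshold_deg=20.0):
    """Hypothetical rule: label the knee 'bent' or 'straight'.

    Each keypoint is an (x, y) tuple; z is fixed to 0 because the frame
    is two-dimensional. The threshold is an illustrative assumption.
    """
    thigh = (knee[0] - hip[0], knee[1] - hip[1], 0.0)
    shin = (ankle[0] - knee[0], ankle[1] - knee[1], 0.0)
    angle = rotation_angle_deg(rotation_quaternion(thigh, shin))
    return "bent" if angle > threshold_deg else "straight"


if __name__ == "__main__":
    # Leg hanging straight down versus shin swung sideways.
    print(knee_state((0.5, 0.4), (0.5, 0.6), (0.5, 0.8)))
    print(knee_state((0.5, 0.4), (0.5, 0.6), (0.7, 0.6)))
```

Because each step is a handful of arithmetic operations per joint, a rule of this kind is cheap enough for an edge device, which is the property the procedural approach relies on.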