The advent of industrial robotics and autonomous systems enables human-robot collaboration at a massive scale. However, current industrial robots are restrained from working with humans in close proximity due to their inability to interpret human agents' attention. Studying human attention is non-trivial, since it involves multiple aspects of the mind: perception, memory, problem solving, and consciousness. Human attention lapses are particularly problematic, and potentially catastrophic, in industrial workplaces, from assembling electronics to operating machines. Attention is indeed complex and cannot be easily measured with single-modality sensors. Eye state, head pose, posture, and manifold environmental stimuli could all play a part in attention lapses. To this end, we propose a pipeline for annotating a multimodal dataset for human attention tracking, including eye tracking, fixation detection, a third-person surveillance camera, and sound. We produce a pilot dataset containing two fully annotated phone assembly sequences recorded in a realistic manufacturing environment. We evaluate existing fatigue and drowsiness prediction methods for attention lapse detection. Experimental results show that human attention lapses in production scenarios are more subtle and imperceptible than the well-studied fatigue and drowsiness.