Organizations have recently started to adopt AI-based decision support tools to optimize human resource development practices, while facing various challenges in using AI in highly contextual and sensitive domains. We present a case study that aims to help professional assessors make decisions in human assessment, in which they interview assessees and evaluate their suitability for certain job roles. Our workshop with two industrial assessors revealed the difficulties they face (i.e., maintaining stable and non-subjective observation of assessees' behaviors) and yielded requirements for AI systems (i.e., extracting assessees' nonverbal cues from interview videos in an interpretable manner). In response, we employed an unsupervised anomaly detection algorithm using multimodal behavioral features such as facial keypoints, body and head pose, and gaze. The algorithm extracts outlier scenes from the video based on these behavioral features and indicates which feature contributes to the outlierness. We first evaluated how the assessors would perceive the extracted cues and found that the algorithm is useful in suggesting scenes to which assessors would pay attention, thanks to its interpretability. Then, we developed an interface prototype incorporating the algorithm and had six assessors use it for their actual assessment. Their comments revealed the effectiveness of introducing unsupervised anomaly detection in enhancing their confidence in, and the perceived objectivity of, the assessment, along with potential use scenarios of such AI-based systems in human assessment. Our approach, which builds on the idea of separating observation and interpretation in human-AI collaboration, will facilitate human decision making in highly contextual domains, such as human assessment, while maintaining users' trust in the system.
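The idea of flagging outlier scenes and reporting which feature drives the outlierness can be illustrated with a minimal sketch. This is not the algorithm used in the study, which the abstract does not specify; it swaps in a simple per-feature robust z-score (median and median absolute deviation) as a stand-in. The function names `robust_z` and `find_outlier_scenes`, the feature names, the sample values, and the threshold of 3.5 are all assumptions made for illustration.

```python
from statistics import median


def robust_z(values):
    """Absolute robust z-score per value, based on median and MAD.

    Assumption: a robust z-score stands in for the paper's (unspecified)
    unsupervised anomaly detector.
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # Simplification: a feature with no spread is treated as never anomalous.
        return [0.0 for _ in values]
    return [0.6745 * abs(v - med) / mad for v in values]


def find_outlier_scenes(features, threshold=3.5):
    """Return (frame_index, top_feature, score) for frames above the threshold.

    `features` maps a hypothetical feature name to one value per frame.
    The frame score is the largest per-feature score, and the feature that
    attains it is reported as the main contributor to the outlierness.
    """
    scores = {name: robust_z(vals) for name, vals in features.items()}
    n_frames = len(next(iter(scores.values())))
    flagged = []
    for t in range(n_frames):
        top = max(scores, key=lambda name: scores[name][t])
        if scores[top][t] > threshold:
            flagged.append((t, top, scores[top][t]))
    return flagged


# Hypothetical per-frame behavioral features (illustrative values only).
features = {
    "head_pitch": [0.0, 0.1, -0.1, 0.0, 0.1, 2.5, 0.0, -0.1],
    "gaze_x": [0.5, 0.4, 0.6, 0.4, 0.6, 0.5, 0.6, 1.8],
}

flagged = find_outlier_scenes(features)
for frame, feature, score in flagged:
    print(f"frame {frame}: driven by {feature} (score {score:.1f})")
```

With these sample values, frame 5 is flagged because of `head_pitch` and frame 7 because of `gaze_x`. Reporting the contributing feature alongside each flagged frame mirrors the separation the abstract describes: the system points out what to observe, and the assessor keeps the interpretation.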