Developing human-like conversational agents is a prime area of HCI research and subsumes many tasks. Predicting listener backchannels is one such actively researched task. While many studies have taken different approaches to backchannel prediction, they have all depended on manual annotation of a large dataset. This dependence is a bottleneck that limits the scalability of development. To address it, we propose using semi-supervised techniques to automate the identification of backchannels, thereby easing the annotation process. To assess the feasibility of our identification module, we compared backchannel prediction models trained on (a) manually annotated labels and (b) semi-supervised labels. Quantitative analysis revealed that the model trained on semi-supervised labels could attain 95% of the performance of the model trained on manual annotations. Our user study revealed that almost 60% of the participants found the backchannel responses predicted by the proposed model more natural. Finally, we analyzed the impact of personality on the type of backchannel signals and validated our findings in the user study.