AI agents designed to collaborate with people benefit from models that enable them to anticipate human behavior. However, realistic models tend to require vast amounts of human data, which is often hard to collect. A good prior or initialization could make for more data-efficient training, but what makes for a good prior on human behavior? Our work leverages a very simple assumption: people generally act closer to optimal than to random chance. We show that using optimal behavior as a prior for human models makes these models vastly more data-efficient and able to generalize to new environments. Our intuition is that such a prior lets training focus one's precious real-world data on capturing the subtle nuances of human suboptimality, instead of on the basics of how to do the task in the first place. We also show that using these improved human models often leads to better human-AI collaboration performance than using models trained on real human data alone.
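A minimal sketch of this idea, under illustrative assumptions (not the paper's implementation): a small PyTorch policy is first behavior-cloned on plentiful near-optimal trajectories to serve as the prior, then fine-tuned on a much smaller set of human demonstrations. The names `PolicyNet` and `behavior_clone`, the toy dimensions, and the placeholder data are all hypothetical.

```python
# Sketch: optimal-behavior prior via pretraining, then fine-tuning on scarce human data.
import torch
import torch.nn as nn
import torch.nn.functional as F

OBS_DIM, N_ACTIONS = 16, 6  # toy dimensions (assumed)


class PolicyNet(nn.Module):
    """Small MLP mapping observations to action logits."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(OBS_DIM, 64), nn.ReLU(),
            nn.Linear(64, N_ACTIONS),
        )

    def forward(self, obs):
        return self.net(obs)


def behavior_clone(model, obs, actions, epochs, lr):
    """Supervised imitation: cross-entropy between predicted and observed actions."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        loss = F.cross_entropy(model(obs), actions)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model


# 1) Optimal-behavior prior: pretrain on abundant near-optimal trajectories
#    (e.g., rollouts of a planner or an RL-trained policy). Random placeholders here.
optimal_obs = torch.randn(10_000, OBS_DIM)
optimal_actions = torch.randint(0, N_ACTIONS, (10_000,))
human_model = behavior_clone(PolicyNet(), optimal_obs, optimal_actions,
                             epochs=50, lr=1e-3)

# 2) Fine-tune on the scarce real human demonstrations, so the model only needs
#    to learn the nuances of human suboptimality rather than the task from scratch.
human_obs = torch.randn(200, OBS_DIM)
human_actions = torch.randint(0, N_ACTIONS, (200,))
human_model = behavior_clone(human_model, human_obs, human_actions,
                             epochs=20, lr=1e-4)
```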