Workers spend a significant amount of time learning how to make good decisions. Evaluating the efficacy of a given decision, however, can be complicated -- e.g., decision outcomes are often long-term and relate to the original decision in complex ways. Surprisingly, even though learning good decision-making strategies is difficult, they can often be expressed in simple and concise forms. Focusing on sequential decision-making, we design a novel machine learning algorithm that is capable of extracting "best practices" from trace data and conveying its insights to humans in the form of interpretable "tips". Our algorithm selects the tip that best bridges the gap between the actions taken by the human workers and those taken by the optimal policy in a way that accounts for which actions are consequential for achieving higher performance. We evaluate our approach through a series of randomized controlled experiments where participants manage a virtual kitchen. Our experiments show that the tips generated by our algorithm can significantly improve human performance relative to intuitive baselines. In addition, we discuss a number of empirical insights that can help inform the design of algorithms intended for human-AI interfaces. For instance, we find evidence that participants do not simply blindly follow our tips; instead, they combine them with their own experience to discover additional strategies for improving performance.
翻译:工人花费大量时间学习如何做出正确的决定。但是,评估某项决定的功效可能很复杂,例如,决策结果往往是长期的,并且以复杂的方式与最初的决定相关。令人惊讶的是,尽管学习良好的决策战略是困难的,但往往可以用简单简洁的形式表达。侧重于顺序决策,我们设计了一种新的机器学习算法,它能够从跟踪数据中提取“最佳做法”并以可解释的“提示”的形式向人类传达其洞察力。我们的算法选择了一条线索,以最佳方式弥合人类工人所采取行动与最佳政策所采取行动之间的差距,从而说明行动对取得更高绩效的影响。我们通过一系列随机化的控制实验来评估我们的方法,参与者管理虚拟厨房。我们的实验表明,我们的算法所产生的提示可以大大改善人类相对于直观基线的绩效。此外,我们讨论了一些经验性洞察力,可以帮助为人类-AI接口设计各种算法。举例说,我们发现,参与者们并不盲目地学习他们自己的战略,而是盲目地学习他们自己的战略。