We present AutoDOViz, an interactive user interface for automated decision optimization (AutoDO) using reinforcement learning (RL). Decision optimization (DO) has classically been practiced by dedicated DO researchers, where experts need to spend long periods of time fine-tuning a solution through trial and error. AutoML pipeline search has sought to make it easier for a data scientist to find the best machine learning pipeline by leveraging automation to search and tune the solution. More recently, these advances have been applied to the domain of AutoDO, with a similar goal: to find the best reinforcement learning pipeline through algorithm selection and parameter tuning. However, decision optimization requires significantly more complex problem specification than an ML problem does. AutoDOViz seeks to lower the barrier to entry for data scientists in problem specification for reinforcement learning problems, to leverage the benefits of AutoDO algorithms for RL pipeline search, and finally, to create visualizations and policy insights that facilitate the typically interactive communication of problem formulations and solution proposals between DO experts and domain experts. In this paper, we report our findings from semi-structured expert interviews with DO practitioners as well as business consultants, leading to design requirements for human-centered automation for DO with RL. We evaluate a system implementation with data scientists and find that they are significantly more open to engaging in DO after using our proposed solution. AutoDOViz further increases trust in RL agent models and makes the automated training and evaluation process more comprehensible. As shown for other automation in ML tasks, we also conclude that automation of RL for DO can benefit from the user, and vice versa, when the interface promotes human-in-the-loop interaction.