Given thousands of equally accurate machine learning (ML) models, how can users choose among them? A recent ML technique enables domain experts and data scientists to generate a complete Rashomon set for sparse decision trees, that is, a huge set of almost-optimal interpretable ML models. To help ML practitioners identify models with desirable properties from this Rashomon set, we develop TimberTrek, the first interactive visualization system that summarizes thousands of sparse decision trees at scale. Two usage scenarios highlight how TimberTrek can empower users to easily explore, compare, and curate models that align with their domain knowledge and values. Our open-source tool runs directly in users' computational notebooks and web browsers, lowering the barrier to creating more responsible ML models. A public demo of TimberTrek is available at https://poloclub.github.io/timbertrek.