The Rashomon effect occurs when many different explanations exist for the same phenomenon. In machine learning, Leo Breiman used this term to characterize problems where many accurate-but-different models describe the same data. In this work, we study how the Rashomon effect can be useful for understanding the relationship between training and test performance, and the possibility that simple-yet-accurate models exist for many problems. We consider the Rashomon set (the set of almost-equally-accurate models for a given problem) and study its properties and the types of models it could contain. We present the Rashomon ratio, the ratio of the volume of the set of accurate models to the volume of the hypothesis space, as a new measure related to the simplicity of model classes; it differs from standard complexity measures from statistical learning theory. For a hierarchy of hypothesis spaces, the Rashomon ratio can help modelers navigate the trade-off between simplicity and accuracy. In particular, we find empirically that a plot of empirical risk vs. Rashomon ratio forms a characteristic $\Gamma$-shaped Rashomon curve, whose elbow seems to be a reliable model selection criterion. When the Rashomon set is large, models that are accurate and also have other useful properties can often be obtained. These models might obey various constraints such as interpretability, fairness, or monotonicity.
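To make the definition of the Rashomon ratio concrete, the following is a minimal sketch of a Monte Carlo estimate, not the procedure used in this work. It rests on several assumptions that do not come from the abstract: the hypothesis space is taken to be linear classifiers with weights in the box $[-1, 1]^d$, volume is measured by uniform sampling over that box, the loss is 0-1 loss, and the best achievable empirical risk is approximated by the best sampled model. The names `rashomon_ratio`, `theta`, and `n_samples` are hypothetical.

```python
import numpy as np


def rashomon_ratio(X, y, theta=0.05, n_samples=20000, seed=0):
    """Monte Carlo estimate of a Rashomon ratio for linear classifiers.

    Assumptions (illustrative only): the hypothesis space is all weight
    vectors in [-1, 1]^d, sampled uniformly, and the loss is 0-1 loss.
    Labels in y are expected to be -1 or +1.
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]

    # Draw models uniformly from the (assumed) bounded hypothesis space.
    W = rng.uniform(-1.0, 1.0, size=(n_samples, d))

    # Predictions of every sampled model on every point: shape (n, n_samples).
    preds = np.where(X @ W.T >= 0, 1, -1)

    # Empirical 0-1 risk of each sampled model.
    risks = (preds != y[:, None]).mean(axis=0)

    # Approximate the minimal empirical risk by the best sampled model.
    best = risks.min()

    # Fraction of sampled models within theta of the best risk.
    in_set = risks <= best + theta
    return float(in_set.mean()), float(best)


# Toy data: two features, labels given by a linear rule (synthetic example).
data_rng = np.random.default_rng(1)
X = data_rng.normal(size=(200, 2))
y = np.where(X[:, 0] + 0.5 * X[:, 1] >= 0, 1, -1)

ratio, best = rashomon_ratio(X, y, theta=0.05)
print(f"estimated Rashomon ratio: {ratio:.4f}, best sampled risk: {best:.4f}")
```

With a fixed seed the same models are sampled on every call, so increasing `theta` can only enlarge the estimated set. Repeating the estimate across a hierarchy of hypothesis spaces and plotting empirical risk against the ratio would be one way to look for the $\Gamma$-shaped curve described above.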