Machine learning (ML) algorithms are gaining importance in many academic and industrial applications and are, accordingly, becoming common components of computer science curricula. Learning ML is challenging not only because of its complex mathematical and algorithmic aspects, but also because of (a) the complexity of using these algorithms correctly in real-life situations and (b) the need to understand related social and ethical issues. Cognitive biases are phenomena of the human brain that may cause erroneous perceptions and irrational decision-making processes. They have been researched thoroughly in cognitive psychology and decision making; they also have important implications for computer science education. One well-known cognitive bias, first described by Kahneman and Tversky, is the base rate neglect bias, according to which humans fail to consider the base rate of the underlying phenomenon when evaluating conditional probabilities. In this paper, we explore how the base rate neglect bias is expressed in ML education. Specifically, we show that about one third of the students in an Introduction to ML course, who come from varied backgrounds (computer science students and teachers, data science, engineering, social science and digital humanities), fail to correctly evaluate ML algorithm performance because of the base rate neglect bias. This failure rate should alert educators and promote the development of new pedagogical methods for teaching ML algorithm performance.
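To make the role of the base rate concrete, the following minimal sketch applies Bayes' rule to a binary classifier. It is an illustration only and is not taken from the study: the function name `precision_from_rates` and the 90% sensitivity and specificity figures are hypothetical assumptions chosen for the arithmetic.

```python
def precision_from_rates(sensitivity: float, specificity: float, base_rate: float) -> float:
    """Return P(actually positive | predicted positive) using Bayes' rule.

    All three arguments are probabilities in [0, 1]. The name and the
    numbers used below are hypothetical, chosen only for illustration.
    """
    # Probability that a random case is positive AND is flagged positive.
    true_positive = sensitivity * base_rate
    # Probability that a random case is negative AND is flagged positive.
    false_positive = (1.0 - specificity) * (1.0 - base_rate)
    return true_positive / (true_positive + false_positive)


# Same hypothetical classifier (90% sensitivity, 90% specificity),
# evaluated on populations with different base rates.
for base_rate in (0.5, 0.1, 0.01):
    precision = precision_from_rates(0.9, 0.9, base_rate)
    print(f"base rate {base_rate:.2f} -> precision {precision:.3f}")
```

Under these assumed figures, the same classifier yields a precision of about 0.9 when half of the cases are positive, but only about 0.5 at a 10% base rate and about 0.08 at a 1% base rate. Judging the classifier from its sensitivity and specificity alone, without the base rate, is the kind of error the abstract describes.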