Machine learning (ML) algorithms are gaining importance in many academic and industrial applications and are, accordingly, becoming common components of computer science curricula. Learning ML is challenging not only because of its complex mathematical and algorithmic aspects, but also because of (a) the complexity of using these algorithms correctly in real-life situations and (b) the need to understand the related social and ethical issues. Cognitive biases are phenomena of the human brain that may cause erroneous perceptions and irrational decision-making. They have been researched thoroughly in cognitive psychology and decision making; they also, however, have important implications for computer science education. One well-known cognitive bias, first described by Kahneman and Tversky, is base rate neglect, in which people fail to consider the base rate of the underlying phenomenon when evaluating conditional probabilities. In this paper, we explore how base rate neglect manifests in ML education. Specifically, we show that about one third of the students in an Introduction to ML course, who come from varied backgrounds (computer science students and teachers, data science, engineering, social science, and digital humanities), fail to evaluate ML algorithm performance correctly because of base rate neglect. This failure rate should alert educators and promote the development of new pedagogical methods for teaching ML algorithm performance.
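To illustrate why the base rate matters when evaluating a classifier, the following minimal Python sketch applies Bayes' rule to compute the probability that a positive prediction is correct. The function name `precision_from_rates` and the numbers used (90% sensitivity, 90% specificity, and the three base rates) are illustrative assumptions, not values taken from the study.

```python
def precision_from_rates(sensitivity, specificity, base_rate):
    """Return P(actually positive | predicted positive) via Bayes' rule.

    All three arguments are probabilities in [0, 1]; the values used
    below are hypothetical and chosen only for illustration.
    """
    true_pos = sensitivity * base_rate                    # P(pred+ and actual+)
    false_pos = (1.0 - specificity) * (1.0 - base_rate)   # P(pred+ and actual-)
    return true_pos / (true_pos + false_pos)


# Same classifier quality, different base rates of the positive class.
for base_rate in (0.5, 0.1, 0.01):
    p = precision_from_rates(0.9, 0.9, base_rate)
    print(f"base rate {base_rate:.2f} -> precision {p:.3f}")
```

Under these assumed numbers, the same classifier is right about 90% of the time when it predicts the positive class at a 50% base rate, but only about 8% of the time at a 1% base rate. Judging the classifier by its sensitivity and specificity alone, without the base rate, is the kind of error the abstract describes.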