We provide an exercise suitable for early introduction in an undergraduate statistics or data science course that allows students to `play the whole game' of data science: performing both data collection and data analysis. While many teaching resources exist for data analysis, resources for data collection are less abundant, given the inherent difficulty of the task. Our proposed exercise centers on student use of Google Calendar to collect data with the goal of answering the question `How do I spend my time?' The question has near-universal appeal, yet the data collection mechanism remains within reach of a modal undergraduate student. A further benefit of this exercise is that it provides an opportunity to discuss the ethical questions and considerations that data providers and data analysts face in today's age of large-scale internet-based data collection.