Data Science is a complex and evolving field, but most agree that it can be defined as a combination of expertise drawn from three broad areascomputer science and technology, math and statistics, and domain knowledge -- with the purpose of extracting knowledge and value from data. Beyond this, the field is often defined as a series of practical activities ranging from the cleaning and wrangling of data, to its analysis and use to infer models, to the visual and rhetorical representation of results to stakeholders and decision-makers. This essay proposes a model of data science that goes beyond laundry-list definitions to get at the specific nature of data science and help distinguish it from adjacent fields such as computer science and statistics. We define data science as an interdisciplinary field comprising four broad areas of expertise: value, design, systems, and analytics. A fifth area, practice, integrates the other four in specific contexts of domain knowledge. We call this the 4+1 model of data science. Together, these areas belong to every data science project, even if they are often unconnected and siloed in the academy.
翻译:暂无翻译