As the production of, and reliance on, datasets to build automated decision-making systems (ADS) increases, so does the need for processes to evaluate and interrogate the underlying data. After launching the Dataset Nutrition Label in 2018, the Data Nutrition Project has made significant updates to the design and purpose of the Label and is launching an updated Label in late 2020, which this paper previews. The new Label includes context-specific Use Cases & Alerts, presented through an updated design and user interface targeted at data scientists. This paper discusses the harm and bias arising from underlying training data that the Label is intended to mitigate; the current state of the work, including new datasets being labeled; new and existing challenges; and further directions for the work. It also includes figures previewing the new Label.