Responsible AI is becoming critical as AI is widely used in our everyday lives. Many companies that deploy AI publicly state that, when training a model, we not only need to improve its accuracy, but also need to guarantee that the model does not discriminate against users (fairness), is resilient to noisy or poisoned data (robustness), is explainable, and more. In addition, these objectives are relevant not only to model training, but to all steps of end-to-end machine learning: data collection, data cleaning and validation, model training, model evaluation, and model management and serving. Finally, responsible AI is conceptually challenging, so supporting all of these objectives must be made as easy as possible. We thus propose three key research directions toward this vision, namely depth, breadth, and usability, to measure progress, and we introduce our ongoing research. First, responsible AI must be deeply supported, where multiple objectives like fairness and robustness are handled together. To this end, we propose FR-Train, a holistic framework for fair and robust model training in the presence of data bias and poisoning. Second, responsible AI must be broadly supported, preferably in all steps of machine learning. Currently, we focus on the data pre-processing steps and propose Slice Tuner, a selective data acquisition framework for training fair and accurate models, and MLClean, a data cleaning framework that also improves fairness and robustness. Finally, responsible AI must be usable, where the techniques are easy to deploy and actionable. We propose FairBatch, a batch selection approach for fairness that is effective and simple to use, and Slice Finder, a model evaluation tool that automatically finds problematic slices. We believe we have only scratched the surface of responsible AI for end-to-end machine learning, and we suggest research challenges moving forward.
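To give a feel for the kind of fairness-aware batch selection the abstract refers to, the following is a minimal illustrative sketch, not the actual FairBatch algorithm. It assumes two sensitive groups and nudges the sampling rate of whichever group currently has the higher average loss, so the model sees more examples from the group it serves worse; the function names, the fixed `step` size, and the two-group simplification are our own illustrative choices.

```python
import random

# Illustrative sketch of fairness-aware batch selection (an assumed
# simplification, NOT the actual FairBatch algorithm): each epoch, shift
# the per-group sampling rate toward the group with the larger loss.

def adjust_sampling_rate(rate_a, loss_a, loss_b, step=0.05):
    """Nudge group A's sampling rate up if its loss exceeds group B's."""
    if loss_a > loss_b:
        rate_a = min(1.0 - step, rate_a + step)
    elif loss_b > loss_a:
        rate_a = max(step, rate_a - step)
    return rate_a

def sample_batch(group_a, group_b, rate_a, batch_size, rng):
    """Draw a batch with roughly rate_a fraction from group A."""
    n_a = round(rate_a * batch_size)
    return rng.sample(group_a, n_a) + rng.sample(group_b, batch_size - n_a)

# Group A's loss is higher, so its sampling rate rises from 0.5.
rate = adjust_sampling_rate(0.5, loss_a=0.9, loss_b=0.4)
batch = sample_batch(list(range(10)), list(range(100, 110)),
                     rate, batch_size=8, rng=random.Random(0))
```

The real method additionally derives the rate adjustment from a chosen fairness metric (e.g., equalized odds) rather than a fixed step, but the core usability point carries over: fairness is pursued purely through batch sampling, with no change to the model or loss function.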
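To illustrate what "automatically finds problematic slices" means in model evaluation, here is a minimal sketch, not the actual Slice Finder algorithm: it flags data slices whose accuracy trails the overall accuracy by more than a gap. The real tool also enumerates candidate slices automatically and tests statistical significance; here the slices, the function name, and the `gap` threshold are our own illustrative assumptions.

```python
# Illustrative sketch (not the actual Slice Finder algorithm): flag data
# slices whose accuracy falls noticeably below the overall accuracy.

def find_problematic_slices(labels, preds, slice_ids, gap=0.05):
    """Return slices whose accuracy trails overall accuracy by > gap."""
    overall = sum(l == p for l, p in zip(labels, preds)) / len(labels)
    by_slice = {}
    for l, p, s in zip(labels, preds, slice_ids):
        correct, total = by_slice.get(s, (0, 0))
        by_slice[s] = (correct + (l == p), total + 1)
    return {
        s: correct / total
        for s, (correct, total) in by_slice.items()
        if overall - correct / total > gap
    }

labels    = [1, 1, 0, 0, 1, 0, 1, 1]
preds     = [1, 1, 0, 0, 0, 1, 0, 1]
slice_ids = ["A", "A", "A", "A", "B", "B", "B", "B"]
print(find_problematic_slices(labels, preds, slice_ids))  # → {'B': 0.25}
```

Even this toy version conveys the usability argument: the analyst is handed the underperforming slice ("B") directly, rather than having to guess which subpopulations to inspect.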