Several AutoML approaches have been proposed to automate the machine learning (ML) process, for example by searching over model architectures and hyper-parameters. However, these AutoML pipelines focus only on improving accuracy on benign samples and ignore model robustness under adversarial attacks. As ML systems are increasingly deployed in mission-critical applications, improving their robustness has become critically important. In this paper, we propose Robusta, the first robust AutoML framework. Robusta uses reinforcement learning (RL) to perform feature selection, aiming to select features that lead to ML systems that are both accurate and robust. We show that a variation of the 0-1 robust loss can be directly optimized via an RL-based combinatorial search in the feature selection scenario. In addition, we employ heuristics based on feature scoring metrics to accelerate the search: mutual information scores, feature importance scores from tree-based classifiers, F scores, and Integrated Gradients (IG) scores, as well as their combinations. We conduct extensive experiments and show that the proposed framework improves model robustness by up to 22% while maintaining competitive accuracy on benign samples compared with other feature selection methods.
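To make the idea of optimizing a 0-1 robust loss over feature subsets concrete, the following is a minimal sketch under stated assumptions, not the Robusta implementation. The abstract does not specify the exact loss variation, the attack, or the RL search, so everything here is hypothetical: the loss counts a sample as an error if it is misclassified at the benign point or under a uniform shift of plus or minus `eps`, the model is a nearest-centroid classifier, and an exhaustive search over subsets stands in for the RL-based combinatorial search. The names `centroid_classifier`, `robust_zero_one_loss`, and `search_subsets` are illustrative only.

```python
from itertools import combinations


def centroid_classifier(X, y, features):
    """Fit a nearest-centroid classifier restricted to the given feature indices."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(r[f] for r in rows) / len(rows) for f in features]

    def predict(x):
        def dist(label):
            return sum((x[f] - c) ** 2 for f, c in zip(features, centroids[label]))
        # Sorting the labels makes tie-breaking deterministic.
        return min(sorted(centroids), key=dist)

    return predict


def robust_zero_one_loss(predict, X, y, eps):
    """Fraction of samples misclassified at the benign point or at either
    uniform +/- eps shift (an assumed toy stand-in for an adversarial attack)."""
    errors = 0
    for x, t in zip(X, y):
        variants = [x, [v + eps for v in x], [v - eps for v in x]]
        if any(predict(v) != t for v in variants):
            errors += 1
    return errors / len(X)


def search_subsets(X, y, k, eps):
    """Score every k-feature subset and return (lowest loss, subset).
    Exhaustive search is used here in place of an RL-based search."""
    best = None
    for subset in combinations(range(len(X[0])), k):
        predict = centroid_classifier(X, y, subset)
        loss = robust_zero_one_loss(predict, X, y, eps)
        if best is None or loss < best[0]:
            best = (loss, subset)
    return best


# Toy data: feature 0 separates the classes with a wide margin,
# feature 1 separates them with a narrow margin, feature 2 is uninformative.
X = [
    [0.0, 0.45, 0.9],
    [0.1, 0.48, 0.1],
    [1.0, 0.52, 0.8],
    [0.9, 0.55, 0.2],
]
y = [0, 0, 1, 1]

print(search_subsets(X, y, k=1, eps=0.1))
```

On this toy data, feature 1 is perfectly accurate on benign samples yet fails under the small shift, so a search driven by benign accuracy alone could not tell it apart from feature 0. A score-guided heuristic, as described in the abstract, would presumably prune or reorder the candidate subsets rather than enumerate them all.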