Bootstrap aggregating (bagging) is an effective ensemble protocol that is believed to enhance robustness through its majority voting mechanism. Recent works further prove sample-wise robustness certificates for certain forms of bagging (e.g., partition aggregation). Going beyond these particular forms, in this paper \emph{we propose the first collective certification for general bagging, which computes the tight robustness against global poisoning attacks}. Specifically, we compute the maximum number of predictions that can be changed simultaneously by solving a binary integer linear programming (BILP) problem. We then analyze the robustness of vanilla bagging and give an upper bound on the tolerable poison budget. Based on this analysis, \emph{we propose hash bagging} to improve the robustness of vanilla bagging almost for free. This is achieved by replacing the random subsampling in vanilla bagging with a hash-based deterministic subsampling, which universally controls the influence scope of each poisoning sample. Our extensive experiments show notable advantages in terms of applicability and robustness.
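To make the idea of hash-based deterministic subsampling concrete, the following is a minimal Python sketch, not the construction used in the paper. The abstract does not specify the hash function or how many sub-trainsets a sample may enter, so the sketch assumes SHA-256 and assigns each sample to exactly one sub-trainset, which limits the influence of any single poisoning sample to one sub-model. The names `hash_bucket`, `hash_subsample`, and `majority_vote` are hypothetical.

```python
import hashlib
from collections import Counter


def hash_bucket(sample_id: str, num_buckets: int) -> int:
    """Map a sample identifier to a bucket index deterministically.

    Assumption: SHA-256 is used only for illustration; the source does
    not name a specific hash function.
    """
    digest = hashlib.sha256(sample_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


def hash_subsample(sample_ids, num_models):
    """Split samples into num_models sub-trainsets by hash value.

    Assumption: each sample lands in exactly one sub-trainset, so one
    poisoned sample can affect at most one sub-model.
    """
    subsets = [[] for _ in range(num_models)]
    for sid in sample_ids:
        subsets[hash_bucket(sid, num_models)].append(sid)
    return subsets


def majority_vote(predictions):
    """Return the most frequent label; ties go to the smallest label."""
    counts = Counter(predictions)
    best = max(counts.values())
    return min(label for label, count in counts.items() if count == best)


if __name__ == "__main__":
    ids = [f"sample-{i}" for i in range(20)]
    for k, subset in enumerate(hash_subsample(ids, num_models=4)):
        print(k, subset)
    print(majority_vote(["cat", "dog", "cat"]))
```

Because the assignment depends only on the sample identifier, it does not change with the order of the training set or with a random seed, which is what allows the influence scope of each sample to be bounded in advance.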