Hierarchical models for text classification can leak sensitive or confidential information from their training data to adversaries because neural networks memorize training examples. Applying differential privacy during model training mitigates such leakage attacks against trained models by perturbing the training optimizer. However, for hierarchical text classification a multiplicity of model architectures is available, and it is unclear whether some architectures yield a better trade-off between remaining model accuracy and model leakage under differentially private training than others. We use a white-box membership inference attack to assess the information leakage of three widely used neural network architectures for hierarchical text classification under differential privacy. We show that relatively weak differential privacy guarantees already suffice to fully mitigate the membership inference attack, while causing only a moderate decrease in utility. More specifically, for large datasets with long texts we observe that transformer-based models achieve an overall favorable privacy-utility trade-off, whereas for smaller datasets with shorter texts CNNs are preferable.
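To illustrate the optimizer perturbation referred to above, the following is a minimal sketch of a DP-SGD-style training step in PyTorch (per-example gradient clipping followed by Gaussian noise). The function name `dp_sgd_step`, the hyperparameters, and the assumed `model`, `loss_fn`, and `batch` objects are illustrative placeholders, not the exact training procedure evaluated in the paper.

```python
import torch

def dp_sgd_step(model, loss_fn, batch, lr=0.1, clip_norm=1.0, noise_multiplier=1.0):
    """One differentially private optimizer step: clip each example's gradient,
    sum the clipped gradients, add Gaussian noise, then update the parameters."""
    inputs, labels = batch
    params = [p for p in model.parameters() if p.requires_grad]
    accumulated = [torch.zeros_like(p) for p in params]

    for x, y in zip(inputs, labels):
        model.zero_grad()
        loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
        grads = torch.autograd.grad(loss, params)
        # Bound each example's influence (the sensitivity) by clipping its gradient norm.
        total_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads)).item()
        scale = min(1.0, clip_norm / (total_norm + 1e-12))
        for acc, g in zip(accumulated, grads):
            acc.add_(g, alpha=scale)

    batch_size = len(inputs)
    with torch.no_grad():
        for p, acc in zip(params, accumulated):
            # Gaussian noise calibrated to the clipping bound masks any single example's contribution.
            noise = torch.randn_like(p) * (noise_multiplier * clip_norm)
            p.add_((acc + noise) / batch_size, alpha=-lr)
```

A larger noise multiplier yields a stronger (smaller epsilon) differential privacy guarantee at the cost of model utility, which is the trade-off examined across architectures in this work.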