Designing scoring functions that help understand, analyze, and learn properties of the high-dimensional hidden representations of large-scale transformer models such as BERT is a challenging task. In this work, we explore a new direction by studying the topological features of BERT hidden representations using persistent homology (PH). We propose a novel scoring function, the "persistence scoring function (PSF)", which: (i) accurately captures the homology of the high-dimensional hidden representations, correlates well with test set accuracy across a wide range of datasets, and outperforms existing scoring metrics; (ii) captures interesting post-fine-tuning "per-class" properties from both qualitative and quantitative viewpoints; (iii) is more stable to perturbations than the baseline functions, which makes it a robust proxy; and (iv) serves as a predictor of attack success rates for a wide category of black-box and white-box adversarial attack methods. Our extensive correlation experiments demonstrate the practical utility of PSF on various NLP tasks relevant to BERT.