Research on bragging behavior on social media has attracted the interest of computational (socio)linguists. However, existing bragging classification datasets suffer from a serious data imbalance issue. Because labeling a balanced dataset is expensive, most methods introduce external knowledge to improve model learning. Nevertheless, such methods inevitably introduce noise and irrelevant information from the external knowledge. To overcome this drawback, we propose a novel bragging classification method with disentangle-based representation augmentation and a domain-aware adversarial strategy. Specifically, the model learns to disentangle and reconstruct representations and to generate augmented features via disentangle-based representation augmentation. Moreover, the domain-aware adversarial strategy constrains the domain of the augmented features to improve their robustness. Experimental results demonstrate that our method achieves state-of-the-art performance compared to other methods.
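The abstract only names the two components at a high level; the following PyTorch sketch shows one plausible realization, assuming a fixed-size sentence representation, a split into task and domain factors, factor-recombination as augmentation, and a gradient-reversal domain classifier. All module names, shapes, and the shuffling scheme are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (assumed architecture): disentangle a sentence representation into
# task and domain factors, reconstruct it, generate augmented features by recombining
# factors, and constrain the augmented features' domain via gradient reversal.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass, negated (scaled) gradient in the backward pass."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

class DisentangleAugment(nn.Module):
    def __init__(self, d=768, z=128, num_domains=2, lambd=1.0):
        super().__init__()
        # Two encoders split the representation into a task factor and a domain factor.
        self.task_enc = nn.Sequential(nn.Linear(d, z), nn.ReLU())
        self.dom_enc = nn.Sequential(nn.Linear(d, z), nn.ReLU())
        # Decoder reconstructs the original representation from the concatenated factors.
        self.decoder = nn.Linear(2 * z, d)
        # Domain classifier behind gradient reversal pushes augmented features
        # toward domain invariance (the "domain-aware adversarial" part).
        self.dom_clf = nn.Linear(z, num_domains)
        self.lambd = lambd

    def forward(self, h):
        zt, zd = self.task_enc(h), self.dom_enc(h)
        recon = self.decoder(torch.cat([zt, zd], dim=-1))
        # Augmentation: pair each task factor with a shuffled domain factor from the batch.
        perm = torch.randperm(h.size(0), device=h.device)
        aug = self.decoder(torch.cat([zt, zd[perm]], dim=-1))
        # Adversarial branch on the augmented features' domain factor.
        dom_logits = self.dom_clf(GradReverse.apply(self.dom_enc(aug), self.lambd))
        return recon, aug, dom_logits
```

In such a setup, training would typically combine a reconstruction loss on `recon`, the bragging-classification loss on both original and augmented features, and a cross-entropy domain loss on `dom_logits`, with the reversed gradients discouraging domain-specific artifacts in the augmented features.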