Automatic identification of salient aspects from user reviews is especially useful for opinion analysis. There has been significant progress in weakly supervised approaches, which require only a small set of seed words to train aspect classifiers. However, two limitations remain. First, no existing weakly supervised approach fully utilizes the latent hierarchies between words. Second, each seed word's representation should carry different latent semantics and be distinct when it represents a different aspect. In this paper, we propose HDAE, a hyperbolic disentangled aspect extractor in which a hyperbolic aspect classifier captures words' latent hierarchies, and an aspect-disentangled representation models the distinct latent semantics of each seed word. Compared to previous baselines, HDAE achieves average F1 performance gains of 18.2% and 24.1% on Amazon product review and restaurant review datasets, respectively. In addition, the embedding visualization experiment demonstrates that HDAE is a more effective approach to leveraging seed words. An ablation study and a case study further attest to the effectiveness of the proposed components.
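
To make the idea of a hyperbolic aspect classifier concrete, the following is a minimal sketch and not HDAE itself. The abstract does not state which model of hyperbolic space is used, so the Poincaré ball is an assumption here, and the function names `poincare_distance` and `nearest_aspect`, the nearest-seed rule, and the toy 2-D vectors are all hypothetical choices made for illustration.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit ball."""
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    if sq_norm_u >= 1.0 or sq_norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    # d(u, v) = arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2)))
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v)))


def nearest_aspect(word_vec, seed_vecs):
    """Return the aspect whose seed embedding is closest in hyperbolic distance.

    This nearest-seed rule is an illustrative assumption, not the paper's classifier.
    """
    return min(seed_vecs, key=lambda aspect: poincare_distance(word_vec, seed_vecs[aspect]))


# Toy 2-D embeddings, invented for illustration only.
seeds = {"price": (0.6, 0.1), "quality": (-0.5, 0.4)}
print(nearest_aspect((0.7, 0.2), seeds))
```

Distances in the Poincaré ball grow rapidly toward the boundary, which is why such spaces are commonly used to embed tree-like hierarchies: general terms can sit near the origin and specific terms near the edge.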