Online discussion forums are prevalent and easily accessible, allowing people to share ideas and opinions by posting messages in discussion threads. When a thread grows significantly in length, participants, both newcomers and existing members, can find it difficult to grasp its main ideas. To mitigate this problem, this study aims to create an automatic text summarizer for online forums. We present a Hierarchical Unified Deep Neural Network that builds sentence and thread representations for forum summarization. In this scheme, a Bi-LSTM derives a representation that encodes information from the whole sentence and the whole thread, whereas a CNN captures the most informative features with respect to context from the sentence and the thread. An attention mechanism is applied on top of the CNN to further highlight the high-level representations that carry important information contributing to a desirable summary. We conduct an extensive performance evaluation on three datasets: two from real-life online forums and one news dataset. The results show that the proposed model outperforms several competitive baselines.
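To make the layering concrete, the following minimal sketch shows one level of the hierarchy: a convolution over a sequence of contextual states, followed by attention pooling into a single vector. It is an illustration under stated assumptions, not the model evaluated here. The random `states` stand in for Bi-LSTM outputs, and the dimensions, the kernel width, and the dot-product scoring against a `query` vector are all hypothetical choices that the abstract does not specify.

```python
import math
import random


def conv1d_relu(states, kernels, width):
    """Slide each kernel over windows of `width` consecutive state vectors
    (no padding) and apply ReLU. Returns one feature vector per window."""
    features = []
    for t in range(len(states) - width + 1):
        # Flatten the window of `width` vectors into one long vector.
        window = [x for vec in states[t:t + width] for x in vec]
        features.append(
            [max(0.0, sum(k * x for k, x in zip(kernel, window)))
             for kernel in kernels]
        )
    return features


def attention_pool(features, query):
    """Score each feature vector against `query` (dot product, an assumed
    scoring function), softmax the scores, and return the weighted sum."""
    scores = [sum(q * x for q, x in zip(query, feat)) for feat in features]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(features[0])
    pooled = [sum(w * feat[j] for w, feat in zip(weights, features))
              for j in range(dim)]
    return pooled, weights


random.seed(0)
T, d, width, n_filters = 6, 8, 3, 4  # hypothetical sizes

# Stand-in for Bi-LSTM outputs: one d-dimensional state per word.
states = [[random.gauss(0, 1) for _ in range(d)] for _ in range(T)]
# Randomly initialised parameters; in a real model these would be learned.
kernels = [[random.gauss(0, 0.1) for _ in range(width * d)]
           for _ in range(n_filters)]
query = [random.gauss(0, 1) for _ in range(n_filters)]

features = conv1d_relu(states, kernels, width)
sentence_vec, weights = attention_pool(features, query)
```

Under the same assumptions, the thread level would repeat these two steps with sentence vectors in place of word states, which is what makes the scheme hierarchical.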