Community Question Answering (CQA) sites have multiplied in recent years. Sites such as Reddit, Quora, and Stack Exchange are increasingly popular among people looking for answers to diverse questions. One practical way to find such answers is to automatically predict the best candidate among the existing answers and comments. Many studies have addressed answer prediction in CQA, but few have made use of the questioner's background information. We address this limitation with a novel method that predicts the best answer using the questioner's background information together with other features, such as the textual content or the relationships with other participants. Our answer classification model was trained on the Stack Exchange dataset and validated using the Area Under the Curve (AUC) metric. The experimental results show that the proposed method complements previous methods by highlighting the importance of the relationships between users, particularly through their level of involvement in different Stack Exchange communities. Furthermore, we point out that there is little overlap between the user-relation information and the information captured by shallow text features and meta-features such as time differences.
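For readers unfamiliar with the validation metric, the following is a minimal sketch of how AUC can be computed for a binary best-answer classifier. It is not the authors' evaluation code: the function name `auc`, the toy labels, and the toy scores are all hypothetical, and the pairwise (Mann-Whitney) formulation is only one of several equivalent ways to compute the metric.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U).

    labels: 1 for a best/accepted answer, 0 for any other answer.
    scores: classifier scores, higher meaning "more likely the best answer".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the Stack Exchange dataset.
labels = [1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.6, 0.5, 0.2]
print(auc(labels, scores))
```

An AUC of 1.0 means every best answer is scored above every other answer, while 0.5 corresponds to random ranking. The quadratic loop is kept for readability; a rank-based version would be preferable on large datasets.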