Semantic tagging, which has extensive applications in text mining, predicts whether a given piece of text conveys the meaning of a given semantic tag. The problem of semantic tagging is largely addressed with supervised learning, and today deep learning models are widely perceived to be superior for semantic tagging. However, no comprehensive study supports this popular belief. Practitioners often have to train different types of models for each semantic tagging task to identify the best one, a process that is both expensive and inefficient. We embark on a systematic study to investigate the following question: Are deep models the best-performing models for all semantic tagging tasks? To answer this question, we compare deep models against "simple models" over datasets with varying characteristics. Specifically, we select three prevalent deep models (i.e., CNN, LSTM, and BERT) and two simple models (i.e., LR and SVM), and compare their performance on the semantic tagging task over 21 datasets. Results show that the size, label ratio, and label cleanliness of a dataset significantly impact the quality of semantic tagging. Simple models achieve tagging quality similar to that of deep models on large datasets, but with much shorter runtime. Moreover, simple models can achieve better tagging quality than deep models when the target datasets have noisier labels and/or more severe class imbalance. Based on these findings, our study can systematically guide practitioners in selecting the right learning model for their semantic tagging tasks.