Is tokenization that is tractable for humans also tractable for machine learning models? This study investigates the relationship between tokenization tractability for humans (e.g., appropriateness and readability) and that for machine learning models (e.g., performance on an NLP task). We compared six tokenization methods on a Japanese commonsense question-answering dataset (JCommonsenseQA in JGLUE). We tokenized the question texts of the QA dataset with different tokenizers and compared the performance of human annotators and machine learning models. In addition, we analyzed the relationships among performance, appropriateness of tokenization, and response time to questions. This paper provides quantitative evidence that tokenizations tractable for humans and for machine learning models are not necessarily the same.
Title: An Annotation-Based Study of Tokenization Tractability for Humans and Machine Learning Models