In this work we explore how language models can be used to analyze language and discriminate between mentally impaired and healthy subjects through the perplexity metric. Perplexity was originally conceived as an information-theoretic measure of how well a given language model predicts a text sequence or, equivalently, how well a word sequence fits a specific language model. We carried out extensive experiments on publicly available data, employing language models ranging from N-grams (2-grams to 5-grams) to GPT-2, a transformer-based language model. We investigated whether perplexity scores can be used to discriminate between transcripts of healthy subjects and of subjects suffering from Alzheimer's Disease (AD). Our best-performing models achieved perfect accuracy and F-score (1.00 in both precision/specificity and recall/sensitivity) in categorizing subjects from both the AD class and the control group. These results suggest that perplexity can be a valuable analytical metric with potential application to supporting early diagnosis of symptoms of mental disorders.
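To make the metric concrete, the following is a minimal, self-contained sketch of perplexity for a bigram model with add-one (Laplace) smoothing; it is an illustration of the general formula PPL = exp(-(1/N) · Σ log p(wᵢ | wᵢ₋₁)), not the paper's actual pipeline, and the toy corpus and function names are ours. A sequence that matches the training distribution should score a lower perplexity than a scrambled one.

```python
import math
from collections import Counter

def train_bigram(tokens):
    # Collect unigram and bigram counts; vocabulary size is used for smoothing.
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    vocab = len(set(tokens))
    return unigrams, bigrams, vocab

def perplexity(tokens, unigrams, bigrams, vocab):
    # PPL = exp(-(1/N) * sum_i log p(w_i | w_{i-1}))
    # with add-one smoothed p(w_i | w_{i-1}) = (c(w_{i-1}, w_i) + 1) / (c(w_{i-1}) + |V|)
    log_prob = 0.0
    n = 0
    for prev, cur in zip(tokens, tokens[1:]):
        p = (bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab)
        log_prob += math.log(p)
        n += 1
    return math.exp(-log_prob / n)

train = "the cat sat on the mat the cat ate".split()
uni, bi, v = train_bigram(train)

seen = perplexity("the cat sat on the mat".split(), uni, bi, v)
novel = perplexity("mat ate on sat cat the".split(), uni, bi, v)
assert seen < novel  # the in-distribution sequence is less "surprising"
```

The same idea scales to the models in the paper: with GPT-2 one would average the token-level log-probabilities from the model head instead of smoothed counts, and the discrimination task reduces to comparing per-transcript perplexity scores between groups.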