As children acquire their language's morphology, they invariably discover the productive processes that generalize to new words. Morphological learning is challenging because even fully productive rules have exceptions, as in the well-known case of the English past tense, where the productive -ed rule coexists with irregular verbs. The Tolerance Principle is a recent proposal that provides a precise threshold on the number of exceptions a productive rule can withstand. Its empirical application so far, however, has required the researcher to fully specify the rules and the set of words over which they are defined. We propose a greedy search model that automatically hypothesizes rules and evaluates their productivity over a vocabulary. When the search for a broadly productive rule fails, the model recursively subdivides the vocabulary and continues the search for productivity over narrower rules. Trained on psychologically realistic data from child-directed input, our model displays developmental patterns observed in child morphology acquisition, including the notoriously complex case of German noun pluralization. Despite receiving only a fraction of the training data, the model also produces responses to nonce words that are more similar to those of human subjects than the responses of current neural network models are.
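To make the search concrete, here is a minimal Python sketch of the two ideas the abstract combines. The abstract does not state the threshold itself; the sketch uses the usual formulation of the Tolerance Principle, in which a rule over N items is productive if its exceptions number at most N / ln N. Everything else is an assumption for illustration: the function names, the feature dictionaries, and the simplification of subdividing on features in a fixed order are hypothetical and should not be read as the authors' implementation.

```python
import math
from collections import Counter


def tolerance_threshold(n):
    """Maximum number of exceptions tolerated by a rule over n items (n / ln n)."""
    # ln(1) = 0, so the formula is undefined for n <= 1; tolerate no exceptions there.
    return n / math.log(n) if n > 1 else 0.0


def is_productive(n, exceptions):
    """True if a rule over n items with this many exceptions passes the threshold."""
    return exceptions <= tolerance_threshold(n)


def learn(items, features):
    """Greedy, recursive search for productive rules (illustrative only).

    items:    list of (feature_dict, inflection) pairs, e.g. ({"gender": "f"}, "-en")
    features: feature names to subdivide on, tried in the given order (an assumption)
    """
    if not items:
        return None
    n = len(items)
    # Greedy step: hypothesize the most frequent inflection as the rule.
    inflection, hits = Counter(infl for _, infl in items).most_common(1)[0]
    exceptions = n - hits
    if is_productive(n, exceptions):
        return {"rule": inflection, "n": n, "exceptions": exceptions}
    if not features:
        # Nothing left to subdivide on: no productive rule, items are memorized.
        return {"rule": None, "n": n, "exceptions": exceptions}
    # Recursive step: subdivide the vocabulary and search for narrower rules.
    feature, remaining = features[0], features[1:]
    groups = {}
    for item in items:
        groups.setdefault(item[0].get(feature), []).append(item)
    return {
        "split_on": feature,
        "branches": {value: learn(group, remaining) for value, group in groups.items()},
    }
```

On a hypothetical toy vocabulary of ten nouns, five feminine taking "-en" and five masculine taking "-e", no single rule survives over the whole set (five exceptions exceed 10 / ln 10, about 4.34), so `learn(toy, ["gender"])` subdivides by gender and finds an exceptionless rule in each subset.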