We study the underexplored problem of Continual Multilingual Learning, where a multilingual model, already trained on task-specific data from all supported languages, is continually updated using batches of new multilingual training data for the same task. We show that naively updating the multilingual model can lead to performance losses on a subset of languages, even though the aggregated performance metric shows an improvement. We establish this phenomenon on four tasks spanning three task families (token-level, sentence-level, and seq2seq). We then build upon recent advances in parameter-efficient finetuning to develop novel finetuning strategies that allow us to jointly minimize language-specific forgetting while encouraging the positive cross-lingual transfer observed in this setup. Our proposed pipeline, LAFT-URIEL, improves the spread of gains across the supported languages while reducing the magnitude of language-specific losses incurred.