Low-resource languages such as Filipino suffer from data scarcity, which makes it challenging to develop NLP applications for the Filipino language. Transfer Learning (TL) techniques alleviate this problem in low-resource settings. In recent years, transformer-based models have proven effective on low-resource tasks, but they face accessibility challenges due to their high compute and memory requirements. For this reason, there is a need for a cheaper but effective alternative. This paper makes three contributions. First, we release a pre-trained AWD-LSTM language model for the Filipino language to serve as a foundation for building Filipino NLP applications. Second, we benchmark AWD-LSTM on the Hate Speech classification task and show that it performs on par with transformer-based models. Third, we analyze the performance of AWD-LSTM in a low-resource setting using a degradation test and compare it with transformer-based models.