编码混合NLP:用于编码混合的扩展和开放 NLP工具包 (CodemixedNLP: An Extensible and Open NLP Toolkit for Code-Mixing) - 专知论文

会员服务 ·

0

Extensibility · NLP · MoDELS · INTERACT · state-of-the-art ·

2021 年 6 月 10 日

CodemixedNLP: An Extensible and Open NLP Toolkit for Code-Mixing

翻译：编码混合NLP:用于编码混合的扩展和开放 NLP工具包

Sai Muralidhar Jayanthi,Kavya Nerella,Khyathi Raghavi Chandu,Alan W Black

from arxiv, Accepted at the Fifth Workshop on Computational Approaches to Linguistic Code-Switching-CALCS 2021

The NLP community has witnessed steep progress in a variety of tasks across the realms of monolingual and multilingual language processing recently. These successes, in conjunction with the proliferating mixed language interactions on social media have boosted interest in modeling code-mixed texts. In this work, we present CodemixedNLP, an open-source library with the goals of bringing together the advances in code-mixed NLP and opening it up to a wider machine learning community. The library consists of tools to develop and benchmark versatile model architectures that are tailored for mixed texts, methods to expand training sets, techniques to quantify mixing styles, and fine-tuned state-of-the-art models for 7 tasks in Hinglish. We believe this work has a potential to foster a distributed yet collaborative and sustainable ecosystem in an otherwise dispersed space of code-mixing research. The toolkit is designed to be simple, easily extensible, and resourceful to both researchers as well as practitioners.

翻译：最近,在单一语言和多语言处理领域的各种任务方面,国家语言方案社区取得了巨大进展,这些成功,加上社交媒体上日益扩大的混合语言互动,提高了对代码混合文本建模的兴趣。在这项工作中,我们展示了代码混合NLP(一个开放源库),这是一个开放源库,目的是将代码混合国家语言方案的进展汇集在一起,并将其开放给更广泛的机器学习界。图书馆由各种工具组成,用于开发和基准多功能模型结构,这些结构是专门为混合文本设计的,扩大培训套件的方法,对混合样式进行量化的技术,以及对Hingish的7项任务进行精细调整的最新模型。我们认为,这项工作有可能在一个原本分散的代码混合研究空间中促进分布的、合作和可持续的生态系统。工具包的设计简单、易于推广,对研究人员和从业人员都是资源丰富的。

0

相关内容

Extensibility

iOS 8 提供的应用间和应用跟系统的功能交互特性。

Today (iOS and OS X): widgets for the Today view of Notification Center
Share (iOS and OS X): post content to web services or share content with others
Actions (iOS and OS X): app extensions to view or manipulate inside another app
Photo Editing (iOS): edit a photo or video in Apple's Photos app with extensions from a third-party apps
Finder Sync (OS X): remote file storage in the Finder with support for Finder content annotation
Storage Provider (iOS): an interface between files inside an app and other apps on a user's device
Custom Keyboard (iOS): system-wide alternative keyboards

Source: iOS 8 Extensions: Apple’s Plan for a Powerful App Ecosystem

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

【论文翻译】NLP注意力机制综述论文翻译，Attention, please! A Critical Review of Neural Attention Models in Natural Language Processing

【论文翻译】NLP注意力机制综述论文翻译，Attention, please! A Critical Review of Neural Attention Models in Natural Language Processing

专知会员服务

96+阅读 · 2020年4月18日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【课程】Andrew Ng与Google Brain团队联合出品《TensorFlow in Practice 》

【课程】Andrew Ng与Google Brain团队联合出品《TensorFlow in Practice 》

专知会员服务

13+阅读 · 2019年10月29日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

BERT/Transformer/迁移学习NLP资源大列表

BERT/Transformer/迁移学习NLP资源大列表

专知

19+阅读 · 2019年6月9日

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

AINLP

40+阅读 · 2019年6月9日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Facebook PyText 在 Github 上开源了

Facebook PyText 在 Github 上开源了

AINLP

7+阅读 · 2018年12月14日

pytorch-pretrained-BERT：BERT PyTorch实现，可加载Google BERT预训练模型

pytorch-pretrained-BERT：BERT PyTorch实现，可加载Google BERT预训练模型

AINLP

35+阅读 · 2018年11月6日

A Survey of Transformers

Arxiv

103+阅读 · 2021年6月8日

An Attentive Survey of Attention Models

An Attentive Survey of Attention Models

Arxiv

44+阅读 · 2020年12月15日

Pre-trained Models for Natural Language Processing: A Survey

Arxiv

113+阅读 · 2020年3月18日

Commonsense Reasoning for Natural Language Understanding: A Survey of Benchmarks, Resources, and Approaches

Arxiv

16+阅读 · 2019年4月2日

XNMT: The eXtensible Neural Machine Translation Toolkit

Arxiv

3+阅读 · 2018年3月1日

VIP会员

文章信息

相关主题

state-of-the-art

相关VIP内容

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

【论文翻译】NLP注意力机制综述论文翻译，Attention, please! A Critical Review of Neural Attention Models in Natural Language Processing

【论文翻译】NLP注意力机制综述论文翻译，Attention, please! A Critical Review of Neural Attention Models in Natural Language Processing

专知会员服务

96+阅读 · 2020年4月18日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【课程】Andrew Ng与Google Brain团队联合出品《TensorFlow in Practice 》

【课程】Andrew Ng与Google Brain团队联合出品《TensorFlow in Practice 》

专知会员服务

13+阅读 · 2019年10月29日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

热门VIP内容

开通专知VIP会员享更多权益服务

人机协同时代的军事指挥控制演进

《英国智库：瓦解俄罗斯防空系统生产，夺回制空权》最新报告

《通过仿真与开源数据提升战略决策：机遇与局限》最新报告

《战术突击工具包：军队的“边缘”操作系统》报告

相关资讯

BERT/Transformer/迁移学习NLP资源大列表

BERT/Transformer/迁移学习NLP资源大列表

专知

19+阅读 · 2019年6月9日

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

AINLP

40+阅读 · 2019年6月9日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Facebook PyText 在 Github 上开源了

Facebook PyText 在 Github 上开源了

AINLP

7+阅读 · 2018年12月14日

pytorch-pretrained-BERT：BERT PyTorch实现，可加载Google BERT预训练模型

pytorch-pretrained-BERT：BERT PyTorch实现，可加载Google BERT预训练模型

AINLP

35+阅读 · 2018年11月6日

相关论文

A Survey of Transformers

Arxiv

103+阅读 · 2021年6月8日

An Attentive Survey of Attention Models

An Attentive Survey of Attention Models

Arxiv

44+阅读 · 2020年12月15日

Pre-trained Models for Natural Language Processing: A Survey

Arxiv

113+阅读 · 2020年3月18日

Commonsense Reasoning for Natural Language Understanding: A Survey of Benchmarks, Resources, and Approaches

Arxiv

16+阅读 · 2019年4月2日

XNMT: The eXtensible Neural Machine Translation Toolkit

Arxiv

3+阅读 · 2018年3月1日

微信扫码咨询专知VIP会员