Disambiguation of morpho-syntactic features of African American English -- the case of habitual be - 专知论文


April 26, 2022

Disambiguation of morpho-syntactic features of African American English -- the case of habitual be


Harrison Santiago, Joshua Martin, Sarah Moeller, Kevin Tang

Recent research has highlighted that natural language processing (NLP) systems exhibit a bias against African American speakers. These bias errors are often caused by poor representation of linguistic features unique to African American English (AAE), because many such features occur with relatively low probability in training data. We present a workflow to overcome such bias in the case of habitual "be". Habitual "be" is isomorphic, and therefore ambiguous, with other forms of "be" found in both AAE and other varieties of English. This creates a clear challenge for bias in NLP technologies. To overcome this scarcity, we employ a combination of rule-based filters and data augmentation that generates a corpus balanced between habitual and non-habitual instances. With this balanced corpus, we train unbiased machine learning classifiers, as demonstrated on a corpus of AAE transcribed texts, achieving an F$_1$ score of .65 in disambiguating habitual "be".
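The abstract names three ingredients: rule-based filtering, balancing habitual against non-habitual instances, and evaluation by F$_1$. The sketch below illustrates the general shape of such a pipeline and is not the authors' implementation. The word list in `NON_HABITUAL_CONTEXT`, the function names, and the use of random oversampling as a stand-in for the paper's data augmentation are all assumptions made for illustration.

```python
import random
import re

# Hypothetical rule (an assumption, not from the paper): "be" directly after
# a modal or "to" is treated as a non-habitual use.
NON_HABITUAL_CONTEXT = {"to", "will", "would", "can", "could",
                        "should", "must", "might", "may"}


def tokenize(sentence):
    """Lowercase the sentence and split it into simple word tokens."""
    return re.findall(r"[a-z']+", sentence.lower())


def rule_filter(sentence):
    """Return None if the sentence has no "be", "non-habitual" if every
    "be" follows a listed context word, and "candidate" otherwise."""
    tokens = tokenize(sentence)
    positions = [i for i, tok in enumerate(tokens) if tok == "be"]
    if not positions:
        return None
    for i in positions:
        if i == 0 or tokens[i - 1] not in NON_HABITUAL_CONTEXT:
            return "candidate"
    return "non-habitual"


def balance_by_oversampling(examples, seed=0):
    """Duplicate randomly chosen minority-class examples until every label
    has as many examples as the largest class. This is a simple stand-in
    for data augmentation, not the paper's method."""
    rng = random.Random(seed)
    by_label = {}
    for text, label in examples:
        by_label.setdefault(label, []).append((text, label))
    target = max(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        balanced.extend(rng.choice(group) for _ in range(target - len(group)))
    return balanced


def f1_score(gold, predicted, positive="habitual"):
    """Compute F1 for the positive class from paired gold/predicted labels."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

In this sketch, `rule_filter("She will be late")` is labelled non-habitual by the modal rule, while `rule_filter("They be working every day")` is passed on as a candidate for a downstream classifier. A real system would need linguistically motivated rules and an augmentation method that produces new sentences rather than duplicates.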

