News Category Dataset - 专知 (Zhuanzhi) Papers

INFORMS · Datasets · Learning · Lecture Notes · NLP

September 23, 2022

News Category Dataset

People rely on news to know what is happening around the world and inform their daily lives. In today's world, when the proliferation of fake news is rampant, having a large-scale and high-quality source of authentic news articles with the published category information is valuable to learning authentic news' Natural Language syntax and semantics. As part of this work, we present a News Category Dataset that contains around 200k news headlines from the year 2012 to 2018 obtained from HuffPost, along with useful metadata to enable various NLP tasks. In this paper, we also produce some novel insights from the dataset and describe various existing and potential applications of our dataset.
