主题上下文向量基于事件不确定性的推特相关性词提取 (Thematic context vector association based on event uncertainty for Twitter) - 专知论文

会员服务 ·

0

上下文向量 · 上下文 · 提取 · 事件 · 不确定 ·

2023 年 4 月 4 日

Thematic context vector association based on event uncertainty for Twitter

翻译：主题上下文向量基于事件不确定性的推特相关性词提取

Vaibhav Khatavkar,Swapnil Mane,Parag Kulkarni

from arxiv, 6 pages

Keyword extraction is a crucial process in text mining. The extraction of keywords with respective contextual events in Twitter data is a big challenge. The challenging issues are mainly because of the informality in the language used. The use of misspelled words, acronyms, and ambiguous terms causes informality. The extraction of keywords with informal language in current systems is pattern based or event based. In this paper, contextual keywords are extracted using thematic events with the help of data association. The thematic context for events is identified using the uncertainty principle in the proposed system. The thematic contexts are weighed with the help of vectors called thematic context vectors which signifies the event as certain or uncertain. The system is tested on the Twitter COVID-19 dataset and proves to be effective. The system extracts event-specific thematic context vectors from the test dataset and ranks them. The extracted thematic context vectors are used for the clustering of contextual thematic vectors which improves the silhouette coefficient by 0.5% than state of art methods namely TF and TF-IDF. The thematic context vector can be used in other applications like Cyberbullying, sarcasm detection, figurative language detection, etc.

翻译：关键词提取是文本挖掘的重要过程。在推特数据中提取具有相应上下文事件的关键词是一项巨大的挑战。挑战主要来自于所使用语言的非正式性。拼写错误的单词、首字母缩略词和含糊不清的术语导致了非正式性问题。当前系统中提取非正式语言中的关键词是基于模式或事件的。本文提出在数据关联的帮助下，使用主题事件提取具有上下文关键词。借助不确定性原理识别事件的主题上下文。使用被称为主题上下文向量的向量衡量主题的确定性或不确定性。在推特COVID-19数据集上测试本系统，证明其有效性。系统从测试数据集中提取特定事件的主题上下文向量并排名。从提取的主题上下文向量用于聚类上下文主题向量，比现有技术方法（如TF和TF-IDF）提高0.5％的轮廓系数。主题上下文向量可用于其他应用程序，如网络欺凌、讽刺检测、比喻语言检测等。

0

相关内容

上下文向量

上下文向量

自然语言处理顶会NAACL2022最佳论文出炉！

自然语言处理顶会NAACL2022最佳论文出炉！

专知会员服务

43+阅读 · 2022年6月30日

【2020关键词提取】基于深度神经网络的关键词提取，Keywords extraction with deep neural network model

【2020关键词提取】基于深度神经网络的关键词提取，Keywords extraction with deep neural network model

专知会员服务

60+阅读 · 2020年5月2日

【2020关键词提取】医学报告的关键词提取和结构化，Keyword extraction and structuralization of medical reports

【2020关键词提取】医学报告的关键词提取和结构化，Keyword extraction and structuralization of medical reports

专知会员服务

33+阅读 · 2020年5月2日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

【WWW2020】学习上下文化文档表示用于医疗答案检索，Learning Contextualized Document Representations for Healthcare Answer Retrieval

【WWW2020】学习上下文化文档表示用于医疗答案检索，Learning Contextualized Document Representations for Healthcare Answer Retrieval

专知会员服务

26+阅读 · 2020年2月10日

【深度学习表格检测、信息提取和结构化】《Table Detection, Information Extraction and Structuring using Deep Learning》by Vihar Kurama

专知会员服务

38+阅读 · 2020年1月23日

【AAAI2020论文】概念结构化嵌入医疗文本表示（Learning Conceptual-Contextual Embeddings for Medical Text）

【AAAI2020论文】概念结构化嵌入医疗文本表示（Learning Conceptual-Contextual Embeddings for Medical Text）

专知会员服务

49+阅读 · 2019年11月15日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【泡泡汇总】CVPR2019 SLAM Paperlist

【泡泡汇总】CVPR2019 SLAM Paperlist

泡泡机器人SLAM

14+阅读 · 2019年6月12日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

19篇ICML2019论文摘录选读！

19篇ICML2019论文摘录选读！

专知

28+阅读 · 2019年4月28日

【论文推荐】最新六篇主题模型相关论文—动态主题模型、主题趋势、大规模并行采样、随机采样、非参主题建模

【论文推荐】最新六篇主题模型相关论文—动态主题模型、主题趋势、大规模并行采样、随机采样、非参主题建模

专知

14+阅读 · 2018年6月24日

【论文推荐】最新五篇命名实体识别相关论文—深度主动学习、Lattice LSTM、混合马尔可夫CRF

【论文推荐】最新五篇命名实体识别相关论文—深度主动学习、Lattice LSTM、混合马尔可夫CRF

专知

26+阅读 · 2018年5月22日

笔记 | Sentiment Analysis

笔记 | Sentiment Analysis

黑龙江大学自然语言处理实验室

10+阅读 · 2018年5月6日

LibRec 精选：推荐的可解释性[综述]

LibRec 精选：推荐的可解释性[综述]

LibRec智能推荐

10+阅读 · 2018年5月4日

【论文推荐】最新六篇视觉问答（VQA）相关论文—盲人问题、物体计数、多模态解释、视觉关系、对抗性网络、对偶循环注意力

【论文推荐】最新六篇视觉问答（VQA）相关论文—盲人问题、物体计数、多模态解释、视觉关系、对抗性网络、对偶循环注意力

专知

32+阅读 · 2018年2月28日

【论文推荐】最新六篇自动问答（QA）相关论文—复杂序列问答、注意力机制、长短时记忆、文本推理、多因素注意力、主动的问答智能体

【论文推荐】最新六篇自动问答（QA）相关论文—复杂序列问答、注意力机制、长短时记忆、文本推理、多因素注意力、主动的问答智能体

专知

18+阅读 · 2018年2月22日

【论文推荐】最新5篇情感分析相关论文—深度学习情感分析综述、情感分析语料库、情感预测性、上下文和位置感知的因子分解模型、LSTM

【论文推荐】最新5篇情感分析相关论文—深度学习情感分析综述、情感分析语料库、情感预测性、上下文和位置感知的因子分解模型、LSTM

专知

55+阅读 · 2018年1月28日

基于多模态影像多维度直方图特征的肝硬化结节早期癌变微循环构建研究

国家自然科学基金

0+阅读 · 2016年12月31日

基于多粒度粗糙集的多属性决策方法研究

国家自然科学基金

3+阅读 · 2014年12月31日

柬埔寨语命名实体识别及汉柬双语可比语料库构建方法研究

国家自然科学基金

0+阅读 · 2014年12月31日

miRNAs介导CD147基因3’-UTR多态性对缺血性脑卒中的关联研究

国家自然科学基金

0+阅读 · 2013年12月31日

褐家鼠群体MHC遗传多态性与SEOV感染相关性的研究

国家自然科学基金

0+阅读 · 2013年12月31日

基于二维相关谱掺伪牛奶检测方法研究

国家自然科学基金

0+阅读 · 2012年12月31日

三江中段夏塞银铅锌矿床微量元素富集机制及对成矿过程的指示：硫化物微区分析

国家自然科学基金

0+阅读 · 2012年12月31日

基于hLDA层次主题模型的中文多文档摘要研究

国家自然科学基金

1+阅读 · 2012年12月31日

基于2D视频视觉关注度的3D重建方法研究

国家自然科学基金

0+阅读 · 2009年12月31日

中文医学文本中关联信息提取方法研究

国家自然科学基金

2+阅读 · 2009年12月31日

Do You Hear The People Sing? Key Point Analysis via Iterative Clustering and Abstractive Summarisation

Arxiv

0+阅读 · 2023年5月25日

Failure Detection for Motion Prediction of Autonomous Driving: An Uncertainty Perspective

Arxiv

0+阅读 · 2023年5月25日

Policy Learning based on Deep Koopman Representation

Arxiv

0+阅读 · 2023年5月24日

Topic-Guided Self-Introduction Generation for Social Media Users

Arxiv

0+阅读 · 2023年5月24日

Physics Constrained Motion Prediction with Uncertainty Quantification

Arxiv

0+阅读 · 2023年5月24日

Deep Learning in Multimodal Remote Sensing Data Fusion: A Comprehensive Review

Arxiv

17+阅读 · 2022年5月3日

A Survey of Uncertainty in Deep Neural Networks

Arxiv

30+阅读 · 2021年7月7日

Reasoning in Dialog: Improving Response Generation by Context Reading Comprehension

Arxiv

12+阅读 · 2020年12月14日

A Survey on Deep Learning for Named Entity Recognition

A Survey on Deep Learning for Named Entity Recognition

Arxiv

73+阅读 · 2018年12月22日

Deep contextualized word representations

Arxiv

10+阅读 · 2018年3月22日

VIP会员

文章信息

相关主题

上下文向量

相关VIP内容

自然语言处理顶会NAACL2022最佳论文出炉！

自然语言处理顶会NAACL2022最佳论文出炉！

专知会员服务

43+阅读 · 2022年6月30日

【2020关键词提取】基于深度神经网络的关键词提取，Keywords extraction with deep neural network model

【2020关键词提取】基于深度神经网络的关键词提取，Keywords extraction with deep neural network model

专知会员服务

60+阅读 · 2020年5月2日

【2020关键词提取】医学报告的关键词提取和结构化，Keyword extraction and structuralization of medical reports

【2020关键词提取】医学报告的关键词提取和结构化，Keyword extraction and structuralization of medical reports

专知会员服务

33+阅读 · 2020年5月2日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

【WWW2020】学习上下文化文档表示用于医疗答案检索，Learning Contextualized Document Representations for Healthcare Answer Retrieval

【WWW2020】学习上下文化文档表示用于医疗答案检索，Learning Contextualized Document Representations for Healthcare Answer Retrieval

专知会员服务

26+阅读 · 2020年2月10日

【深度学习表格检测、信息提取和结构化】《Table Detection, Information Extraction and Structuring using Deep Learning》by Vihar Kurama

专知会员服务

38+阅读 · 2020年1月23日

【AAAI2020论文】概念结构化嵌入医疗文本表示（Learning Conceptual-Contextual Embeddings for Medical Text）

【AAAI2020论文】概念结构化嵌入医疗文本表示（Learning Conceptual-Contextual Embeddings for Medical Text）

专知会员服务

49+阅读 · 2019年11月15日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

热门VIP内容

开通专知VIP会员享更多权益服务

用于无人机的C波段空地通信系统研究 | 2025最新116页

甚高频军事战术通信系统传播性能分析研究

军事通信系统：安全行动的支柱

卫星与地面通信系统：美陆军面临的空间与电子战局势 | 39页报告

相关资讯

【泡泡汇总】CVPR2019 SLAM Paperlist

【泡泡汇总】CVPR2019 SLAM Paperlist

泡泡机器人SLAM

14+阅读 · 2019年6月12日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

19篇ICML2019论文摘录选读！

19篇ICML2019论文摘录选读！

专知

28+阅读 · 2019年4月28日

【论文推荐】最新六篇主题模型相关论文—动态主题模型、主题趋势、大规模并行采样、随机采样、非参主题建模

【论文推荐】最新六篇主题模型相关论文—动态主题模型、主题趋势、大规模并行采样、随机采样、非参主题建模

专知

14+阅读 · 2018年6月24日

【论文推荐】最新五篇命名实体识别相关论文—深度主动学习、Lattice LSTM、混合马尔可夫CRF

【论文推荐】最新五篇命名实体识别相关论文—深度主动学习、Lattice LSTM、混合马尔可夫CRF

专知

26+阅读 · 2018年5月22日

笔记 | Sentiment Analysis

笔记 | Sentiment Analysis

黑龙江大学自然语言处理实验室

10+阅读 · 2018年5月6日

LibRec 精选：推荐的可解释性[综述]

LibRec 精选：推荐的可解释性[综述]

LibRec智能推荐

10+阅读 · 2018年5月4日

【论文推荐】最新六篇视觉问答（VQA）相关论文—盲人问题、物体计数、多模态解释、视觉关系、对抗性网络、对偶循环注意力

【论文推荐】最新六篇视觉问答（VQA）相关论文—盲人问题、物体计数、多模态解释、视觉关系、对抗性网络、对偶循环注意力

专知

32+阅读 · 2018年2月28日

【论文推荐】最新六篇自动问答（QA）相关论文—复杂序列问答、注意力机制、长短时记忆、文本推理、多因素注意力、主动的问答智能体

【论文推荐】最新六篇自动问答（QA）相关论文—复杂序列问答、注意力机制、长短时记忆、文本推理、多因素注意力、主动的问答智能体

专知

18+阅读 · 2018年2月22日

【论文推荐】最新5篇情感分析相关论文—深度学习情感分析综述、情感分析语料库、情感预测性、上下文和位置感知的因子分解模型、LSTM

【论文推荐】最新5篇情感分析相关论文—深度学习情感分析综述、情感分析语料库、情感预测性、上下文和位置感知的因子分解模型、LSTM

专知

55+阅读 · 2018年1月28日

相关论文

Do You Hear The People Sing? Key Point Analysis via Iterative Clustering and Abstractive Summarisation

Arxiv

0+阅读 · 2023年5月25日

Failure Detection for Motion Prediction of Autonomous Driving: An Uncertainty Perspective

Arxiv

0+阅读 · 2023年5月25日

Policy Learning based on Deep Koopman Representation

Arxiv

0+阅读 · 2023年5月24日

Topic-Guided Self-Introduction Generation for Social Media Users

Arxiv

0+阅读 · 2023年5月24日

Physics Constrained Motion Prediction with Uncertainty Quantification

Arxiv

0+阅读 · 2023年5月24日

Deep Learning in Multimodal Remote Sensing Data Fusion: A Comprehensive Review

Arxiv

17+阅读 · 2022年5月3日

A Survey of Uncertainty in Deep Neural Networks

Arxiv

30+阅读 · 2021年7月7日

Reasoning in Dialog: Improving Response Generation by Context Reading Comprehension

Arxiv

12+阅读 · 2020年12月14日

A Survey on Deep Learning for Named Entity Recognition

A Survey on Deep Learning for Named Entity Recognition

Arxiv

73+阅读 · 2018年12月22日

Deep contextualized word representations

Arxiv

10+阅读 · 2018年3月22日

相关基金

基于多模态影像多维度直方图特征的肝硬化结节早期癌变微循环构建研究

国家自然科学基金

0+阅读 · 2016年12月31日

基于多粒度粗糙集的多属性决策方法研究

国家自然科学基金

3+阅读 · 2014年12月31日

柬埔寨语命名实体识别及汉柬双语可比语料库构建方法研究

国家自然科学基金

0+阅读 · 2014年12月31日

miRNAs介导CD147基因3’-UTR多态性对缺血性脑卒中的关联研究

国家自然科学基金

0+阅读 · 2013年12月31日

褐家鼠群体MHC遗传多态性与SEOV感染相关性的研究

国家自然科学基金

0+阅读 · 2013年12月31日

基于二维相关谱掺伪牛奶检测方法研究

国家自然科学基金

0+阅读 · 2012年12月31日

三江中段夏塞银铅锌矿床微量元素富集机制及对成矿过程的指示：硫化物微区分析

国家自然科学基金

0+阅读 · 2012年12月31日

基于hLDA层次主题模型的中文多文档摘要研究

国家自然科学基金

1+阅读 · 2012年12月31日

基于2D视频视觉关注度的3D重建方法研究

国家自然科学基金

0+阅读 · 2009年12月31日

中文医学文本中关联信息提取方法研究

国家自然科学基金

2+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员