Exploiting Language Model for Efficient Linguistic Steganalysis - Zhuanzhi Papers

February 2, 2022

Exploiting Language Model for Efficient Linguistic Steganalysis

Biao Yi, Hanzhou Wu, Guorui Feng, Xinpeng Zhang

From arXiv. Accepted to the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2022.

Recent advances in linguistic steganalysis have successively applied CNNs, RNNs, GNNs, and other efficient deep models to detect secret information in generated texts. These methods tend to seek stronger feature extractors to achieve better steganalysis performance. However, we have found through experiments that there is a significant difference between automatically generated stego texts and carrier texts in the conditional probability distribution of individual words. This difference can be naturally captured by the language model used to generate the stego texts. Through further experiments, we conclude that this ability can be transferred to a text classifier through pre-training and fine-tuning to improve detection performance. Motivated by this insight, we propose two methods for efficient linguistic steganalysis. One is to pre-train an RNN-based language model, and the other is to pre-train a sequence autoencoder. The results indicate that both methods yield performance gains, to different degrees, over a randomly initialized RNN, and that convergence is significantly faster. Moreover, our methods achieve the best performance among the related works compared, while providing a solution for the real-world scenario in which there are more cover texts than stego texts.
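The central observation in the abstract is that a language model assigns each word a conditional probability given its context, and that these probabilities are distributed differently for stego and carrier texts. The sketch below illustrates only that general idea, under loud assumptions: it swaps the paper's RNN language model for a toy add-one-smoothed bigram model, all function names (`train_bigram_lm`, `cond_log_probs`, `mean_log_prob`) are hypothetical, and a word-scrambled sentence stands in as a crude proxy for text whose statistics deviate from the training corpus. It is not real stego text and not the authors' method.

```python
import math
from collections import Counter


def train_bigram_lm(sentences):
    """Count context words and bigrams over tokenized sentences (toy stand-in for an RNN LM)."""
    unigrams, bigrams = Counter(), Counter()
    vocab = set()
    for tokens in sentences:
        padded = ["<s>"] + tokens  # "<s>" marks the sentence start
        vocab.update(padded)
        for prev, cur in zip(padded, padded[1:]):
            unigrams[prev] += 1          # how often `prev` appears as a context
            bigrams[(prev, cur)] += 1    # how often `cur` follows `prev`
    return unigrams, bigrams, len(vocab)


def cond_log_probs(tokens, model):
    """Return log p(w_t | w_{t-1}) for every word, using add-one smoothing."""
    unigrams, bigrams, vocab_size = model
    padded = ["<s>"] + tokens
    return [
        math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size))
        for prev, cur in zip(padded, padded[1:])
    ]


def mean_log_prob(tokens, model):
    """Average per-word conditional log-probability of a sentence."""
    log_probs = cond_log_probs(tokens, model)
    return sum(log_probs) / len(log_probs)


# Tiny illustrative corpus (hypothetical data, not from the paper).
corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the rug".split(),
]
model = train_bigram_lm(corpus)

natural = "the cat sat on the rug".split()
scrambled = "rug the on sat cat the".split()  # proxy for off-distribution text

print("natural  :", round(mean_log_prob(natural, model), 3))
print("scrambled:", round(mean_log_prob(scrambled, model), 3))
```

In this toy setting the per-word log-probabilities from `cond_log_probs` play the role of features that a downstream classifier could consume; the abstract's approach instead reuses the pre-trained model's parameters by fine-tuning it as a text classifier.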

