
August 26, 2022

I still know it's you! On Challenges in Anonymizing Source Code


Micha Horlboge, Erwin Quiring, Roland Meyer, Konrad Rieck

The source code of a program not only defines its semantics but also contains subtle clues that can identify its author. Several studies have shown that these clues can be automatically extracted using machine learning and allow for determining a program's author among hundreds of programmers. This attribution poses a significant threat to developers of anti-censorship and privacy-enhancing technologies, as they become identifiable and may be prosecuted. An ideal protection from this threat would be the anonymization of source code. However, neither theoretical nor practical principles of such an anonymization have been explored so far. In this paper, we tackle this problem and develop a framework for reasoning about code anonymization. We prove that the task of generating a $k$-anonymous program -- a program that cannot be attributed to one of $k$ authors -- is not computable and thus a dead end for research. As a remedy, we introduce a relaxed concept called $k$-uncertainty, which enables us to measure the protection of developers. Based on this concept, we empirically study candidate techniques for anonymization, such as code normalization, coding style imitation, and code obfuscation. We find that none of the techniques provides sufficient protection when the attacker is aware of the anonymization. While we introduce an approach for removing remaining clues from the code, the main result of our work is negative: Anonymization of source code is a hard and open problem.
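The abstract names $k$-uncertainty but does not define it, so the following is only an illustrative sketch of how such a measure might be computed, not the authors' definition. It assumes an attribution classifier that returns one score per candidate author, and it treats a program as $k$-uncertain when at least $k$ authors score within a small tolerance of the top score. The function names, the `tolerance` parameter, and the example scores are all hypothetical.

```python
from typing import Mapping


def uncertainty_level(scores: Mapping[str, float], tolerance: float = 0.05) -> int:
    """Count the authors whose score lies within `tolerance` of the top score.

    `scores` maps each candidate author to the score assigned by a
    (hypothetical) attribution classifier. This reading of "uncertainty"
    is an assumption made for illustration only.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    top = max(scores.values())
    return sum(1 for s in scores.values() if top - s <= tolerance)


def is_k_uncertain(scores: Mapping[str, float], k: int, tolerance: float = 0.05) -> bool:
    """Return True if at least k authors are near-indistinguishable from the top one."""
    return uncertainty_level(scores, tolerance) >= k


# Hypothetical classifier outputs for two programs.
anonymized = {"alice": 0.34, "bob": 0.33, "carol": 0.31, "dave": 0.02}
original = {"alice": 0.90, "bob": 0.05, "carol": 0.05}

print(uncertainty_level(anonymized))  # three authors are close to the top score
print(uncertainty_level(original))    # only the true author stands out
```

Under this reading, a larger level means the attacker's classifier is less able to single out one author. The abstract's negative result corresponds to the case where an attacker who knows about the anonymization retrains the classifier and the level falls back toward 1.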

