GraphBinMatch: Graph-based Similarity Learning for Cross-Language Binary and Source Code Matching - Zhuanzhi Papers


Cross-language · Code · Similarity · Graph-based · Programming languages

April 10, 2023

GraphBinMatch: Graph-based Similarity Learning for Cross-Language Binary and Source Code Matching


Ali TehraniJamsaz, Hanze Chen, Ali Jannesari

Matching binary code to source code, and vice versa, has applications in fields such as computer security, software engineering, and reverse engineering. Although methods exist that match source code with binary code to accelerate reverse engineering, most of them focus on a single programming language. In practice, however, programs are developed in different programming languages depending on their requirements, so cross-language binary-to-source code matching has recently gained more attention. Nonetheless, existing approaches still struggle to make precise predictions because of the inherent difficulty of matching binary code and source code across programming languages. In this paper, we address the problem of cross-language binary-to-source code matching. We propose GraphBinMatch, an approach based on a graph neural network that learns the similarity between binary and source code. We evaluate GraphBinMatch on several tasks, including cross-language binary-to-source code matching and cross-language source-to-source matching, as well as single-language binary-to-source code matching. Experimental results show that GraphBinMatch significantly outperforms the state of the art, with F1 score improvements as high as 15%.
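The abstract does not describe the network itself, so the following is only a minimal sketch of the general idea behind graph-based similarity: turn each program into a graph, pass information between neighboring nodes, pool the result into one vector, and compare vectors. The one-hot node features, the toy graphs, the mean-aggregation rule, and the function names `embed_graph` and `cosine_similarity` are all assumptions made for illustration. This is not GraphBinMatch's architecture, and nothing here is learned.

```python
import math


def embed_graph(features, edges, rounds=2):
    """Embed a graph as one vector using untrained mean aggregation.

    features: dict mapping node id -> list of floats (hypothetical node features)
    edges:    list of (u, v) pairs, treated as undirected
    """
    # Build adjacency lists for every node.
    neighbors = {node: [] for node in features}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)

    hidden = {node: list(vec) for node, vec in features.items()}
    for _ in range(rounds):
        updated = {}
        for node, vec in hidden.items():
            # Average the node's own vector with its neighbors' vectors.
            group = [vec] + [hidden[m] for m in neighbors[node]]
            updated[node] = [sum(col) / len(group) for col in zip(*group)]
        hidden = updated

    # Readout: average all node vectors into a single graph vector.
    return [sum(col) / len(hidden) for col in zip(*hidden.values())]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Toy graphs with one-hot features over three made-up node kinds.
source_vec = embed_graph(
    {0: [1, 0, 0], 1: [0, 1, 0], 2: [0, 0, 1]},
    [(0, 1), (1, 2)],
)
binary_vec = embed_graph(
    {0: [1, 0, 0], 1: [0, 1, 0], 2: [0, 0, 1], 3: [1, 0, 0]},
    [(0, 1), (1, 2), (2, 3)],
)
unrelated_vec = embed_graph(
    {0: [0, 0, 1], 1: [0, 0, 1]},
    [(0, 1)],
)

print(cosine_similarity(source_vec, binary_vec))
print(cosine_similarity(source_vec, unrelated_vec))
```

In this toy setup the structurally similar pair scores higher than the unrelated pair. A trained model would replace the fixed averaging with learned weights and would fit the similarity score to labeled matching and non-matching pairs.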


