Evaluating a bot detection model on git commit messages - 专知 (Zhuanzhi) Papers

Git · MoDELS · Classification models · Precision/Accuracy · ReQuEST

March 22, 2021

Evaluating a bot detection model on git commit messages

Mehdi Golzadeh, Alexandre Decan, Tom Mens

From arXiv; 4 pages, plus 1 page of references

Detecting the presence of bots in distributed software development activity is very important in order to prevent bias in large-scale socio-technical empirical analyses. In previous work, we proposed a classification model to detect bots in GitHub repositories based on the pull request and issue comments of GitHub accounts. The current study generalises the approach to git contributors based on their commit messages. We train and evaluate the classification model on a large dataset of 6,922 git contributors. The original model based on pull request and issue comments obtained a precision of 0.77 on this dataset. Retraining the classification model on git commit messages increased the precision to 0.80. As a proof-of-concept, we implemented this model in BoDeGiC, an open source command-line tool to detect bots in git repositories.
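The precision figures quoted above (0.77 and 0.80) can be read as the share of flagged contributors that really are bots. The following is a minimal sketch of that calculation, assuming precision is computed for the bot class as TP / (TP + FP); the abstract does not say whether the authors use a per-class or an averaged precision. The function name `precision` and the label lists are hypothetical illustrations, not data or code from the paper or from BoDeGiC.

```python
def precision(y_true, y_pred, positive="bot"):
    """Return TP / (TP + FP) for the `positive` class.

    y_true: ground-truth labels, one per contributor.
    y_pred: labels predicted by a classifier, in the same order.
    Returns 0.0 when nothing was predicted as `positive`.
    """
    pairs = list(zip(y_true, y_pred))
    # True positives: predicted bot and actually a bot.
    tp = sum(1 for t, p in pairs if p == positive and t == positive)
    # False positives: predicted bot but actually something else.
    fp = sum(1 for t, p in pairs if p == positive and t != positive)
    if tp + fp == 0:
        return 0.0
    return tp / (tp + fp)


# Hypothetical labels for six contributors (not taken from the paper).
y_true = ["bot", "human", "bot", "human", "human", "bot"]
y_pred = ["bot", "bot", "bot", "human", "human", "human"]

score = precision(y_true, y_pred)
print(f"precision = {score:.2f}")
```

In this toy example three contributors are flagged as bots and two of them are bots, so one human is misclassified; in an empirical analysis, that misclassified human would be wrongly excluded from the data.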
