发现网络垃圾邮件分类模式的系统化框架 (A systematic framework to discover pattern for web spam classification) - 专知论文

会员服务 ·

0

Engineering · 必应搜索 · 决策树 · 必应（Bing） · MoDELS ·

2017 年 11 月 19 日

A systematic framework to discover pattern for web spam classification

翻译：发现网络垃圾邮件分类模式的系统化框架

Hamed Jelodar,Yongli Wang,Chi Yuan,Xiaohui Jiang

from arxiv, Proceedings of IEEE IEMCON 2017

Web spam is a big problem for search engine users in World Wide Web. They use deceptive techniques to achieve high rankings. Although many researchers have presented the different approach for classification and web spam detection still it is an open issue in computer science. Analyzing and evaluating these websites can be an effective step for discovering and categorizing the features of these websites. There are several methods and algorithms for detecting those websites, such as decision tree algorithm. In this paper, we present a systematic framework based on CHAID algorithm and a modified string matching algorithm (KMP) for extract features and analysis of these websites. We evaluated our model and other methods with a dataset of Alexa Top 500 Global Sites and Bing search engine results in 500 queries.

翻译：网络垃圾邮件是万维网搜索引擎用户面临的一个大问题。他们使用欺骗性技术达到高分。虽然许多研究人员提出了分类和网络垃圾邮件检测的不同方法, 但它仍然是计算机科学中的一个开放问题。分析和评估这些网站可以成为发现和分类这些网站特征的有效步骤。检测这些网站有多种方法和算法, 例如决策树算法。在本文中, 我们提出了一个基于CHAID算法的系统框架, 和一个修改的字符串匹配算法( KMP), 用于提取这些网站的特征和分析。我们用Alexa 500顶全球站点和Bing搜索引擎的数据集对我们的模型和其他方法进行了500个查询结果的评估。

0

相关内容

Engineering

《工程》是中国工程院（CAE）于2015年推出的国际开放存取期刊。其目的是提供一个高水平的平台，传播和分享工程研发的前沿进展、当前主要研究成果和关键成果；报告工程科学的进展，讨论工程发展的热点、兴趣领域、挑战和前景，在工程中考虑人与环境的福祉和伦理道德，鼓励具有深远经济和社会意义的工程突破和创新，使之达到国际先进水平，成为新的生产力，从而改变世界，造福人类，创造新的未来。期刊链接：https://www.sciencedirect.com/journal/engineering

【新书：机器学习简介】《A Concise Introduction to Machine Learning》by A.C. Faul (CRC 2019)

【新书：机器学习简介】《A Concise Introduction to Machine Learning》by A.C. Faul (CRC 2019)

专知会员服务

77+阅读 · 2020年2月8日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

2019年机器学习框架回顾

2019年机器学习框架回顾

专知会员服务

36+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

机器学习相关资源(框架、库、软件)大列表

机器学习相关资源(框架、库、软件)大列表

专知会员服务

40+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【报告推荐】线上食品推荐中的数据分析（Computational Data Analytics on the Web for Better Food Decision Making）

【报告推荐】线上食品推荐中的数据分析（Computational Data Analytics on the Web for Better Food Decision Making）

专知会员服务

16+阅读 · 2019年10月2日

CCF A类 | 顶级会议RTSS 2019诚邀稿件

CCF A类 | 顶级会议RTSS 2019诚邀稿件

Call4Papers

10+阅读 · 2019年4月17日

人工智能 | ISAIR 2019诚邀稿件（推荐SCI期刊）

人工智能 | ISAIR 2019诚邀稿件（推荐SCI期刊）

Call4Papers

6+阅读 · 2019年4月1日

Call for Participation: Shared Tasks in NLPCC 2019

Call for Participation: Shared Tasks in NLPCC 2019

中国计算机学会

5+阅读 · 2019年3月22日

人工智能 | CCF推荐期刊专刊约稿信息6条

人工智能 | CCF推荐期刊专刊约稿信息6条

Call4Papers

5+阅读 · 2019年2月18日

人工智能 | SCI期刊专刊信息3条

人工智能 | SCI期刊专刊信息3条

Call4Papers

5+阅读 · 2019年1月10日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Python机器学习教程资料/代码

Python机器学习教程资料/代码

机器学习研究会

8+阅读 · 2018年2月22日

【推荐】MXNet深度情感分析实战

【推荐】MXNet深度情感分析实战

机器学习研究会

16+阅读 · 2017年10月4日

Text Detection and Recognition in the Wild: A Review

Arxiv

20+阅读 · 2020年6月8日

GIANT: Scalable Creation of a Web-scale Ontology

Arxiv

10+阅读 · 2020年4月5日

A Survey of Deep Learning for Scientific Discovery

A Survey of Deep Learning for Scientific Discovery

Arxiv

29+阅读 · 2020年3月26日

A survey on Semi-, Self- and Unsupervised Techniques in Image Classification

A survey on Semi-, Self- and Unsupervised Techniques in Image Classification

Arxiv

100+阅读 · 2020年2月20日

Mining Disinformation and Fake News: Concepts, Methods, and Recent Advancements

Mining Disinformation and Fake News: Concepts, Methods, and Recent Advancements

Arxiv

16+阅读 · 2020年1月2日

Deep Learning for Deepfakes Creation and Detection

Deep Learning for Deepfakes Creation and Detection

Arxiv

6+阅读 · 2019年9月25日

Text Classification Algorithms: A Survey

Arxiv

6+阅读 · 2019年4月25日

Interpretable machine learning: definitions, methods, and applications

Interpretable machine learning: definitions, methods, and applications

Arxiv

19+阅读 · 2019年1月14日

LNEMLC: Label Network Embeddings for Multi-Label Classification

Arxiv

4+阅读 · 2019年1月1日

The Search Problem in Mixture Models

Arxiv

3+阅读 · 2018年2月24日

VIP会员

文章信息

相关主题

必应（Bing）

相关VIP内容

【新书：机器学习简介】《A Concise Introduction to Machine Learning》by A.C. Faul (CRC 2019)

【新书：机器学习简介】《A Concise Introduction to Machine Learning》by A.C. Faul (CRC 2019)

专知会员服务

77+阅读 · 2020年2月8日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

2019年机器学习框架回顾

2019年机器学习框架回顾

专知会员服务

36+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

机器学习相关资源(框架、库、软件)大列表

机器学习相关资源(框架、库、软件)大列表

专知会员服务

40+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【报告推荐】线上食品推荐中的数据分析（Computational Data Analytics on the Web for Better Food Decision Making）

【报告推荐】线上食品推荐中的数据分析（Computational Data Analytics on the Web for Better Food Decision Making）

专知会员服务

16+阅读 · 2019年10月2日

热门VIP内容

开通专知VIP会员享更多权益服务

《美陆军徒步机动作战条令手册》最新168页

【博士论文】基于不确定性的可靠性：现代机器学习中的选择性预测与可信部署

军事后勤数字化未来展望

《美海军后勤体系整合与创新挑战》最新报告

相关资讯

CCF A类 | 顶级会议RTSS 2019诚邀稿件

CCF A类 | 顶级会议RTSS 2019诚邀稿件

Call4Papers

10+阅读 · 2019年4月17日

人工智能 | ISAIR 2019诚邀稿件（推荐SCI期刊）

人工智能 | ISAIR 2019诚邀稿件（推荐SCI期刊）

Call4Papers

6+阅读 · 2019年4月1日

Call for Participation: Shared Tasks in NLPCC 2019

Call for Participation: Shared Tasks in NLPCC 2019

中国计算机学会

5+阅读 · 2019年3月22日

人工智能 | CCF推荐期刊专刊约稿信息6条

人工智能 | CCF推荐期刊专刊约稿信息6条

Call4Papers

5+阅读 · 2019年2月18日

人工智能 | SCI期刊专刊信息3条

人工智能 | SCI期刊专刊信息3条

Call4Papers

5+阅读 · 2019年1月10日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Python机器学习教程资料/代码

Python机器学习教程资料/代码

机器学习研究会

8+阅读 · 2018年2月22日

【推荐】MXNet深度情感分析实战

【推荐】MXNet深度情感分析实战

机器学习研究会

16+阅读 · 2017年10月4日

相关论文

Text Detection and Recognition in the Wild: A Review

Arxiv

20+阅读 · 2020年6月8日

GIANT: Scalable Creation of a Web-scale Ontology

Arxiv

10+阅读 · 2020年4月5日

A Survey of Deep Learning for Scientific Discovery

A Survey of Deep Learning for Scientific Discovery

Arxiv

29+阅读 · 2020年3月26日

A survey on Semi-, Self- and Unsupervised Techniques in Image Classification

A survey on Semi-, Self- and Unsupervised Techniques in Image Classification

Arxiv

100+阅读 · 2020年2月20日

Mining Disinformation and Fake News: Concepts, Methods, and Recent Advancements

Mining Disinformation and Fake News: Concepts, Methods, and Recent Advancements

Arxiv

16+阅读 · 2020年1月2日

Deep Learning for Deepfakes Creation and Detection

Deep Learning for Deepfakes Creation and Detection

Arxiv

6+阅读 · 2019年9月25日

Text Classification Algorithms: A Survey

Arxiv

6+阅读 · 2019年4月25日

Interpretable machine learning: definitions, methods, and applications

Interpretable machine learning: definitions, methods, and applications

Arxiv

19+阅读 · 2019年1月14日

LNEMLC: Label Network Embeddings for Multi-Label Classification

Arxiv

4+阅读 · 2019年1月1日

The Search Problem in Mixture Models

Arxiv

3+阅读 · 2018年2月24日

微信扫码咨询专知VIP会员