Apache Spark Streaming, Kafka 和 DyronicIO: 企业和科学计算机的性能和建筑比较 (Apache Spark Streaming, Kafka and HarmonicIO: A Performance and Architecture Comparison for Enterprise and Scientific Computing) - 专知论文

会员服务 ·

0

Performer · Stream Processing · Spark · Integration · 流 ·

2019 年 1 月 22 日

Apache Spark Streaming, Kafka and HarmonicIO: A Performance and Architecture Comparison for Enterprise and Scientific Computing

翻译：Apache Spark Streaming, Kafka 和 DyronicIO: 企业和科学计算机的性能和建筑比较

Ben Blamey,Andreas Hellander,Salman Toor

This paper presents a benchmark of stream processing throughput comparing Apache Spark Streaming (under file-, socket- and Kafka-based stream integration), with a prototype P2P stream processing framework, HarmonicIO. Maximum throughput for broad range of stream processing loads are measured, in particular, those with large message sizes (up to 10MB), and heavy CPU load -- loads more typical of scientific computing use cases (such as microscopy), than enterprise contexts. A detailed exploration of the performance characteristics of these integrations under varying loads reveals a complex interplay of performance trade-offs, uncovering the boundaries of good performance for each framework and integration. Based on these results, we suggest which frameworks and integrations are likely to offer good performance for a given load. Broadly, the advantages of Spark's rich feature set comes at a cost of sensitivity to message size in particular, whereas the simplicity of HarmonicIO offers more robust performance, especially for raw CPU utilization.

翻译：本文介绍了将Apache Spark Streaming(在文件、套接和基于Kafka的流流集成下)与原型P2P流处理框架 " 和谐组织 " 比较的溪流处理输送量基准。测量了广泛的溪流处理负荷的最大输送量,特别是信息大小大(高达10MB)和重的CPU负荷 -- -- 科学计算使用案例(如显微镜)比企业环境更典型的负荷。详细探讨不同负荷下这些集成的性能特点,揭示了业绩权衡的复杂相互作用,揭示了每个框架和集成的良好性能界限。基于这些结果,我们建议哪些框架和集成有可能为特定负荷提供良好的性能。广而言,Spark的丰富功能组合的优势是以对信息大小特别敏感为代价的,而HarmonicIO的简单化提供了更强有力的性能,特别是用于原始CPU的利用。

0

相关内容

Performer

【阿里巴巴达摩院】TResNet: 高性能的GPU专用架构，GPU-Dedicated Architecture

【阿里巴巴达摩院】TResNet: 高性能的GPU专用架构，GPU-Dedicated Architecture

专知会员服务

33+阅读 · 2020年4月1日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

【新书】Java企业微服务，Enterprise Java Microservices，272页pdf

【新书】Java企业微服务，Enterprise Java Microservices，272页pdf

专知会员服务

53+阅读 · 2020年1月30日

UC.Berkeley CS189讲义教材:《机器学习全面指南》，185页pdf

专知会员服务

162+阅读 · 2020年1月16日

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

专知会员服务

15+阅读 · 2019年10月23日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

开源书：PyTorch深度学习起步

开源书：PyTorch深度学习起步

专知会员服务

51+阅读 · 2019年10月11日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Call for Participation: Shared Tasks in NLPCC 2019

Call for Participation: Shared Tasks in NLPCC 2019

中国计算机学会

5+阅读 · 2019年3月22日

【TED】生命中的每一年的智慧

【TED】生命中的每一年的智慧

英语演讲视频每日一推

10+阅读 · 2019年1月29日

人工智能 | SCI期刊专刊信息3条

人工智能 | SCI期刊专刊信息3条

Call4Papers

5+阅读 · 2019年1月10日

大数据 | 顶级SCI期刊专刊/国际会议信息7条

大数据 | 顶级SCI期刊专刊/国际会议信息7条

Call4Papers

10+阅读 · 2018年12月29日

医学 | 顶级SCI期刊专刊/国际会议信息4条

医学 | 顶级SCI期刊专刊/国际会议信息4条

Call4Papers

5+阅读 · 2018年12月28日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

人工智能类 | 国际会议/SCI期刊专刊信息9条

人工智能类 | 国际会议/SCI期刊专刊信息9条

Call4Papers

4+阅读 · 2018年7月10日

人工智能 | 国际会议/SCI期刊约稿信息9条

人工智能 | 国际会议/SCI期刊约稿信息9条

Call4Papers

3+阅读 · 2018年1月12日

TResNet: High Performance GPU-Dedicated Architecture

TResNet: High Performance GPU-Dedicated Architecture

Arxiv

8+阅读 · 2020年3月30日

Towards Automated Machine Learning: Evaluation and Comparison of AutoML Approaches and Tools

Towards Automated Machine Learning: Evaluation and Comparison of AutoML Approaches and Tools

Arxiv

3+阅读 · 2019年9月3日

A Survey of the Recent Architectures of Deep Convolutional Neural Networks

A Survey of the Recent Architectures of Deep Convolutional Neural Networks

Arxiv

39+阅读 · 2019年1月17日

A Sentiment Analysis of Breast Cancer Treatment Experiences and Healthcare Perceptions Across Twitter

Arxiv

4+阅读 · 2018年5月25日

A Benchmark Study on Sentiment Analysis for Software Engineering Research

Arxiv

3+阅读 · 2018年3月17日

Baselines and test data for cross-lingual inference

Arxiv

3+阅读 · 2018年3月2日

Polypus: a Big Data Self-Deployable Architecture for Microblogging Text Extraction and Real-Time Sentiment Analysis

Arxiv

3+阅读 · 2018年1月11日

Integrating both Visual and Audio Cues for Enhanced Video Caption

Arxiv

4+阅读 · 2017年12月9日

A Big Data Analysis Framework Using Apache Spark and Deep Learning

Arxiv

3+阅读 · 2017年11月25日

Twitter Sentiment Analysis

Arxiv

5+阅读 · 2015年9月14日

VIP会员

文章信息

相关主题

Stream Processing

相关VIP内容

【阿里巴巴达摩院】TResNet: 高性能的GPU专用架构，GPU-Dedicated Architecture

【阿里巴巴达摩院】TResNet: 高性能的GPU专用架构，GPU-Dedicated Architecture

专知会员服务

33+阅读 · 2020年4月1日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

【新书】Java企业微服务，Enterprise Java Microservices，272页pdf

【新书】Java企业微服务，Enterprise Java Microservices，272页pdf

专知会员服务

53+阅读 · 2020年1月30日

UC.Berkeley CS189讲义教材:《机器学习全面指南》，185页pdf

专知会员服务

162+阅读 · 2020年1月16日

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

专知会员服务

15+阅读 · 2019年10月23日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

开源书：PyTorch深度学习起步

开源书：PyTorch深度学习起步

专知会员服务

51+阅读 · 2019年10月11日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

大语言模型智能体强化学习：全景综述

《城市滨海地区：理解复杂多变环境下的指挥控制框架》50页报告

【伯克利博士论文】从推理服务到训练：面向大规模 LLM 智能体的高效系统

美空军“顶点2025”实验：推进AI在C2、动态目标锁定与联盟集成中的应用

相关资讯

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Call for Participation: Shared Tasks in NLPCC 2019

Call for Participation: Shared Tasks in NLPCC 2019

中国计算机学会

5+阅读 · 2019年3月22日

【TED】生命中的每一年的智慧

【TED】生命中的每一年的智慧

英语演讲视频每日一推

10+阅读 · 2019年1月29日

人工智能 | SCI期刊专刊信息3条

人工智能 | SCI期刊专刊信息3条

Call4Papers

5+阅读 · 2019年1月10日

大数据 | 顶级SCI期刊专刊/国际会议信息7条

大数据 | 顶级SCI期刊专刊/国际会议信息7条

Call4Papers

10+阅读 · 2018年12月29日

医学 | 顶级SCI期刊专刊/国际会议信息4条

医学 | 顶级SCI期刊专刊/国际会议信息4条

Call4Papers

5+阅读 · 2018年12月28日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

人工智能类 | 国际会议/SCI期刊专刊信息9条

人工智能类 | 国际会议/SCI期刊专刊信息9条

Call4Papers

4+阅读 · 2018年7月10日

人工智能 | 国际会议/SCI期刊约稿信息9条

人工智能 | 国际会议/SCI期刊约稿信息9条

Call4Papers

3+阅读 · 2018年1月12日

相关论文

TResNet: High Performance GPU-Dedicated Architecture

TResNet: High Performance GPU-Dedicated Architecture

Arxiv

8+阅读 · 2020年3月30日

Towards Automated Machine Learning: Evaluation and Comparison of AutoML Approaches and Tools

Towards Automated Machine Learning: Evaluation and Comparison of AutoML Approaches and Tools

Arxiv

3+阅读 · 2019年9月3日

A Survey of the Recent Architectures of Deep Convolutional Neural Networks

A Survey of the Recent Architectures of Deep Convolutional Neural Networks

Arxiv

39+阅读 · 2019年1月17日

A Sentiment Analysis of Breast Cancer Treatment Experiences and Healthcare Perceptions Across Twitter

Arxiv

4+阅读 · 2018年5月25日

A Benchmark Study on Sentiment Analysis for Software Engineering Research

Arxiv

3+阅读 · 2018年3月17日

Baselines and test data for cross-lingual inference

Arxiv

3+阅读 · 2018年3月2日

Polypus: a Big Data Self-Deployable Architecture for Microblogging Text Extraction and Real-Time Sentiment Analysis

Arxiv

3+阅读 · 2018年1月11日

Integrating both Visual and Audio Cues for Enhanced Video Caption

Arxiv

4+阅读 · 2017年12月9日

A Big Data Analysis Framework Using Apache Spark and Deep Learning

Arxiv

3+阅读 · 2017年11月25日

Twitter Sentiment Analysis

Arxiv

5+阅读 · 2015年9月14日

微信扫码咨询专知VIP会员