The Internet is transforming our society, necessitating a quantitative understanding of Internet traffic. Our team collects and curates the largest publicly available Internet traffic data sets. An analysis of 50 billion packets using 10,000 processors in the MIT SuperCloud reveals a new phenomenon: the importance of otherwise unseen leaf nodes and isolated links in Internet traffic. Our analysis further shows that a two-parameter modified Zipf-Mandelbrot distribution accurately describes a wide variety of source/destination statistics on moving sample windows ranging from 100,000 to 100,000,000 packets over collections that span years and continents. The measured model parameters distinguish different network streams, and the model leaf parameter strongly correlates with the fraction of the traffic in different underlying network topologies.
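The abstract does not state the functional form of the two-parameter model. As a minimal sketch, the following assumes a common Zipf-Mandelbrot form, p(d) ∝ 1/(d + δ)^α, where d is a degree-like count (for example, packets per source), α is the exponent, and δ is an offset. The function name, the parameter names, and the example values are illustrative assumptions, not results or definitions from the paper; whether δ corresponds to the abstract's "leaf parameter" is likewise an assumption.

```python
def zipf_mandelbrot_pmf(d_max, alpha, delta):
    """Return normalized probabilities p(d) for d = 1..d_max.

    Assumed form (not taken from the abstract): p(d) ∝ 1 / (d + delta) ** alpha.
    """
    if d_max < 1:
        raise ValueError("d_max must be at least 1")
    if delta <= -1:
        raise ValueError("delta must be greater than -1 so that d + delta > 0")
    # Unnormalized weights for each degree d = 1, 2, ..., d_max.
    weights = [1.0 / (d + delta) ** alpha for d in range(1, d_max + 1)]
    total = sum(weights)
    # Normalize so the probabilities sum to 1 over the truncated support.
    return [w / total for w in weights]


# Hypothetical parameter values, chosen only for illustration.
pmf = zipf_mandelbrot_pmf(d_max=1000, alpha=2.0, delta=0.5)
pmf_flatter = zipf_mandelbrot_pmf(d_max=1000, alpha=2.0, delta=5.0)
```

In this assumed form, a larger offset δ lowers the probability mass at d = 1 relative to larger d, while α controls how quickly the tail decays.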