The structure of IP addresses observed in Internet traffic plays a critical role for a wide range of networking problems of current interest. For example, modern network telemetry systems that take advantage of existing data plane technologies for line rate traffic monitoring and processing cannot afford to waste precious data plane resources on traffic that comes from "uninteresting" regions of the IP address space. However, there is currently no well-established structural model or analysis toolbox that enables a first-principles approach to the specific problem of identifying "uninteresting" regions of the address space or the myriad of other networking problems that prominently feature IP addresses. To address this key missing piece, we present in this paper a first-of-its-kind empirically validated physical explanation for why the observed IP address structure in measured Internet traffic is multifractal in nature. Our root cause analysis overcomes key limitations of mostly forgotten findings from ~20 years ago and demonstrates that the Internet processes and mechanisms responsible for how IP addresses are allocated, assigned, and used in today's Internet are consistent with and well modeled by a class of evocative mathematical models called conservative cascades. We complement this root cause analysis with the development of an improved toolbox that is tailor-made for analyzing finite and discrete sets of IP addresses and includes statistical estimators that engender high confidence in the inferences they produce. We illustrate the use of this toolbox in the context of a novel address structure anomaly detection method we designed and conclude with a discussion of a range of challenging open networking problems that are motivated or inspired by our findings.
翻译:暂无翻译