Distributed digital infrastructures for computation and analytics are evolving towards an interconnected ecosystem in which complex applications can be executed across resources ranging from IoT Edge devices to the HPC Cloud (also known as the Computing Continuum, the Digital Continuum, or the Transcontinuum). Understanding end-to-end performance in such a complex continuum is challenging: it comes down to reconciling many, typically contradictory, application requirements and constraints with low-level infrastructure design choices. One important challenge is to accurately reproduce the relevant behaviors of a given application workflow and representative settings of the physical infrastructure underlying this continuum. We introduce a rigorous methodology for such a process and validate it through E2Clab, the first platform to support the complete experimental cycle across the Computing Continuum: deployment, analysis, and optimization. Preliminary results with real-life use cases show that E2Clab allows one to understand and improve performance by correlating it with the parameter settings, the resource usage, and the specifics of the underlying infrastructure.