The Offline Software of the CMS Experiment at the Large Hadron Collider (LHC) at CERN consists of 6M lines of in-house code, developed over a decade by nearly 1000 physicists, as well as a comparable amount of general-use open-source code. A critical ingredient in the successful construction and early operation of the Worldwide LHC Computing Grid (WLCG) was the convergence, around the year 2000, on a homogeneous environment of commodity x86-64 processors and Linux. Apache Mesos is a cluster manager that provides efficient resource isolation and sharing across distributed applications, or frameworks. It can run Hadoop, Jenkins, Spark, Aurora, and other applications on a dynamically shared pool of nodes. We present how we migrated our continuous integration system to schedule jobs on a relatively small Apache Mesos-enabled cluster, and how this resulted in better resource usage, higher peak performance, and lower latency thanks to the dynamic scheduling capabilities of Mesos.