Deciding what constitutes a single module, which classes belong to which module, and what the right set of modules is for a specific software system has always been a challenging task. The problem is even harder in large-scale software systems composed of thousands of classes and hundreds of modules. Over the years, researchers have proposed different techniques to support developers in re-modularizing their software systems. In particular, search-based software re-modularization has been an active research topic within the software engineering community for more than 20 years. This paper describes our efforts in applying search-based software re-modularization approaches at Adyen, a large-scale payment company. Adyen's code base has 5.5M+ lines of code, split into around 70 different modules. We leveraged the existing body of knowledge in the field to devise our own search algorithm and applied it to our code base. Our results show that search-based approaches scale to large code bases such as ours. Our algorithm can find solutions that improve the code base according to the metrics we optimize for, and developers see value in the recommendations. Based on our experiences, we then list a set of challenges and opportunities for future researchers, aiming at making search-based software re-modularization more efficient for large-scale software companies.