More and more distributed software systems are being developed and deployed today. Like other software, distributed systems need strong quality assurance support. However, distributed software is often large and complex, is composed of distributed components, and lacks a global clock. These characteristics make it challenging to analyze the information flow of such systems in support of software quality assurance. One challenge is that existing dynamic analysis techniques scale poorly to large, real-world distributed software systems. A second is developing dynamic analysis approaches that are cost-effective. A third is the applicability and portability of dynamic analysis algorithms and applications for distributed software. My dissertation addresses these challenges with three novel approaches to data flow analysis for distributed software. The first measures interprocess communications to understand the behavior of distributed software and predict its quality. The second pinpoints sensitive information through a multi-staged, refinement-based dynamic information flow analysis for distributed software. The third explores dynamic dependence analysis for distributed systems, using reinforcement learning to automatically adjust analysis configurations for scalability and better cost-effectiveness tradeoffs.