This is the first work to report on inferential testing at scale in industry. Specifically, it reports the experience of automated testing of integrity systems at Meta. We built an internal tool called ALPACAS for automated inference of end-to-end integrity tests. Integrity tests are designed to keep users safe online by checking that interventions take place when harmful behaviour occurs on a platform. ALPACAS infers not only the test input, but also the oracle, by observing production interventions to prevent harmful behaviour. This approach allows Meta to automate the process of generating integrity tests for its platforms, such as Facebook and Instagram, which consist of hundreds of millions of lines of production code. We outline the design and deployment of ALPACAS, and report results for its coverage, number of tests produced at each stage of the test inference process, and their pass rates. Specifically, we demonstrate that using ALPACAS significantly improves coverage from a manual test design for the particular aspect of integrity end-to-end testing it was applied to. Further, from a pool of 3 million data points, ALPACAS automatically yields 39 production-ready end-to-end integrity tests. We also report that the ALPACAS-inferred test suite enjoys exceptionally low flakiness for end-to-end testing with its average in-production pass rate of 99.84%.
翻译:这是首次报告工业规模测价情况的工作。具体来说,它报告在梅塔自动测试完整系统的经验。我们建造了一个名为ALPASAS的内部工具,用于自动测价端到端的完整测试。完整性测试的目的是通过检查在平台发生有害行为时的干预措施,使用户在网上安全。ALPAS不仅推断测试输入,而且还推断神器,观察生产干预,以防止有害行为。这种方法使Meta能够自动测试其平台,如Facebook和Instagram等,该平台由数亿条生产代码组成的Facebook和Instagram。我们概述了ALPASAS的设计和部署情况,并报告了其覆盖范围、测试过程每个阶段的测试次数及其通过率。具体地说,我们证明使用ALPASASCAS大大改进了其完整性最终测试特定方面的手工测试设计范围。此外,从300万个数据点中,ALPASASAS自动得出39个生产前端端至端的完整度测试结果。我们还报告说,在ASARAA中,该测试了384年度的低度标准。我们还报告说,该测试了ASALAAAAAAAAAA标准中的平均端测试。