From simulating galaxy formation to viral transmission in a pandemic, scientific models play a pivotal role in developing scientific theories and supporting government policy decisions that affect us all. Given these critical applications, a poor modelling assumption or bug could have far-reaching consequences. However, scientific models possess several properties that make them notoriously difficult to test, including a complex input space, long execution times, and non-determinism, rendering existing testing techniques impractical. In fields such as epidemiology, where researchers seek answers to challenging causal questions, a statistical methodology known as Causal Inference has addressed similar problems, enabling the inference of causal conclusions from noisy, biased, and sparse data instead of costly experiments. This paper introduces the Causal Testing Framework: a framework that uses Causal Inference techniques to establish causal effects from existing data, enabling users to conduct software testing activities concerning the effect of a change, such as Metamorphic Testing and Sensitivity Analysis, a posteriori. We present three case studies covering real-world scientific models, demonstrating how the Causal Testing Framework can infer test outcomes from reused, confounded test data to provide an efficient solution for testing scientific modelling software.