We present a proof-by-simulation framework for rigorously controlling a trial design's operating characteristics over continuous regions of parameter space. We show that proof-by-simulation can achieve two major goals: (1) calibrate a design for provable Type I Error control at a fixed level alpha; (2) given a fixed design, bound (with high probability) its operating characteristics, such as the Type I Error, FDR, or bias of bounded estimators. This framework can handle adaptive sampling, nuisance parameters, administrative censoring, multiple arms, and multiple testing. These techniques, which we call Continuous Simulation Extension (CSE), were first developed in Sklar (2021) to control Type I Error and FWER for designs where unknown parameters have exponential family likelihood. Our appendix improves those results with more efficient bounding and calibration methods, extends them to general operating characteristics including FDR and bias, and extends applicability to include canonical GLMs and some non-parametric problems. In the main paper we demonstrate our CSE approach and software on three examples: (1) a gentle introduction analyzing the z-test; (2) a hierarchical Bayesian analysis of 4 treatments with fixed sample sizes; (3) an adaptive Bayesian Phase II-III selection design with 4 arms, where interim dropping and go/no-go decisions are based on a hierarchical model. Trillions of simulations were performed for the latter two examples, enabled by specialized INLA software. Open-source software is maintained at (1)
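To make the second goal concrete, here is a minimal sketch of the pointwise ingredient for a z-test setting: simulate the rejection rate at a few null parameter values and attach a high-probability upper bound to each estimate. This is not the CSE method itself. CSE extends such guarantees to the continuous region between grid points, which this sketch does not attempt. The one-sided test of theta <= 0, the grid, the Hoeffding bound, and all function names are assumptions chosen for illustration, not taken from the paper or its software.

```python
import math
import random


def simulate_type1_error(theta, n_sims, z_crit, rng):
    """Fraction of simulated statistics Z ~ N(theta, 1) that exceed z_crit."""
    rejections = sum(rng.gauss(theta, 1.0) > z_crit for _ in range(n_sims))
    return rejections / n_sims


def hoeffding_upper_bound(p_hat, n_sims, delta):
    """Upper bound on the true rejection probability at one parameter value.

    By Hoeffding's inequality the bound holds with probability >= 1 - delta
    over the simulation randomness (pointwise, not uniformly over the grid).
    """
    return min(1.0, p_hat + math.sqrt(math.log(1.0 / delta) / (2.0 * n_sims)))


def pointwise_bounds(grid, n_sims=20000, z_crit=1.96, delta=0.01, seed=0):
    """Return (theta, estimate, upper bound) for each grid point in the null."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    results = []
    for theta in grid:
        p_hat = simulate_type1_error(theta, n_sims, z_crit, rng)
        upper = hoeffding_upper_bound(p_hat, n_sims, delta)
        results.append((theta, p_hat, upper))
    return results


# Hypothetical grid over the null region theta <= 0.
grid = [-1.0, -0.75, -0.5, -0.25, 0.0]
bounds = pointwise_bounds(grid)
for theta, p_hat, upper in bounds:
    print(f"theta={theta:+.2f}  estimate={p_hat:.4f}  upper bound={upper:.4f}")
```

The Hoeffding bound is used here only because it needs nothing beyond the standard library; a tighter binomial bound could be substituted without changing the structure.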