Response-adaptive randomization (RAR) can increase participant benefit in clinical trials, but it also complicates statistical analysis. A burn-in period (a non-adaptive initial stage) is commonly used to mitigate this disadvantage, yet guidance on its optimal duration is scarce. To address this gap, this paper introduces an exact evaluation approach to investigate how the burn-in length affects the statistical operating characteristics of two-arm binary Bayesian RAR (BRAR) designs. We show that (1) commonly used calibration and asymptotic tests exhibit substantial type I error rate inflation for BRAR designs without a burn-in period, and increasing the total burn-in length to more than half the trial size reduces but does not eliminate this inflation, so exact tests are needed; (2) exact tests conditioning on total successes attain the highest average and minimum power up to large burn-in lengths; (3) the burn-in length substantially influences power and participant benefit, which are often not maximized at the maximum or minimum possible burn-in length; (4) the choice of test statistic influences the type I error rate and power; (5) estimation bias decreases more quickly with the burn-in length for larger treatment effects and, for a fixed burn-in length, increases with the trial size. We illustrate our approach by redesigning the ARREST trial.