We prove that the mixing time of the No-U-Turn Sampler (NUTS), when initialized in the concentration region of the canonical Gaussian measure, scales as $d^{1/4}$, up to logarithmic factors, where $d$ is the dimension. This scaling is expected to be sharp. This result is based on a coupling argument that leverages the geometric structure of the target distribution. Specifically, concentration of measure results in a striking uniformity in NUTS' locally adapted transitions, which holds with high probability. This uniformity is formalized by interpreting NUTS as an accept/reject Markov chain, where the mixing properties for the more uniform accept chain are analytically tractable. Additionally, our analysis uncovers a previously unnoticed issue with the path length adaptation procedure of NUTS, specifically related to looping behavior, which we address in detail.
翻译:暂无翻译