The last success problem is an optimal stopping problem in which the goal is to maximize the probability of stopping on the last success in a sequence of $n$ independent Bernoulli trials. In the classical setting, where complete information about the distributions is available, Bruss~\cite{B00} provided an optimal stopping policy that ensures a winning probability of $1/e$. However, assuming complete knowledge of the distributions is unrealistic in many practical applications. This paper investigates a variant of the last success problem in which only samples from each distribution are available, rather than complete knowledge of the distributions. When a single sample from each distribution is allowed, we provide a deterministic policy that guarantees a winning probability of $1/4$. This is the best possible, as it matches the upper bound provided by Nuti and Vondr\'{a}k~\cite{NV23}. Furthermore, for any positive constant $\epsilon$, we show that a constant number of samples from each distribution is sufficient to guarantee a winning probability of $1/e-\epsilon$.