The basic Multi-Armed Bandits (MABs) problem is to maximize the total reward obtained from a bandit whose arms each pay out according to a different, unknown probability distribution, given that only a finite number of pulls can be made. When studying trading algorithms in a market, we face one of the most complex variants of the MABs problem, namely the Non-stationary Continuum Bandits (NCBs) problem. The Bristol Stock Exchange (BSE) is a simple simulation of an electronic financial exchange based on a continuous double auction run via a limit order book. The market can be populated by automated trader agents running different trading algorithms. Among them, the PRSH algorithm embodies some basic ideas for solving NCBs problems. However, its hyperparameters are difficult to tune, and it struggles to adapt to changes in complex market conditions. We propose a new algorithm called PRB, which addresses the Continuum Bandits problem with Bayesian optimization and the Non-stationary Bandits problem with a novel "bandit-over-bandit" framework. Using BSE, we populate the market with as many kinds of trader agents as possible to simulate a real market environment under two different market dynamics. We then determine the optimal hyperparameters of the PRSH and PRB algorithms under each market dynamic. Finally, by having trader agents that use both algorithms trade in the same market at the same time, we demonstrate that PRB outperforms PRSH under both market dynamics. In particular, we perform rigorous hypothesis testing on all experimental results to verify that they are statistically valid.