We explore online inductive transfer learning, in which a feature representation is transferred from a radial basis function network, whose hidden processing units are formed from Gaussian mixture models, to a direct recurrent reinforcement learning agent. The agent learns a desired position via policy gradient reinforcement learning. We put this transfer learner to work trading the major spot market currency pairs. In our experiment, we account accurately for transaction and funding costs. These sources of profit and loss, including the price trends that occur in the currency markets, are made available to the recurrent reinforcement learner through a quadratic utility, and the learner targets a position directly. We improve upon earlier work by casting the problem of learning to target a risk position in an online transfer learning context. Our agent achieves an annualised portfolio information ratio of 0.52 with a compound return of 9.3\%, net of execution and funding costs, over a seven-year test set. This is despite forcing the model to trade at the close of the trading day, 5 pm EST, when trading costs are statistically the most expensive.
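The pipeline described in the abstract can be illustrated with a minimal sketch of the forward pass only. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names, the diagonal-covariance Gaussian units, the tanh position function, the linear transaction cost, the signed funding (carry) term, and the exact form of the quadratic utility. The online fitting of the mixture models and the policy gradient update are not shown.

```python
import numpy as np

def rbf_features(x, means, variances):
    """Gaussian hidden-unit activations (assumed diagonal covariances).

    x: (d,) input vector; means, variances: (k, d), one row per hidden unit.
    Returns a (k,) vector of activations in (0, 1].
    """
    sq_dist = ((x - means) ** 2 / variances).sum(axis=1)
    return np.exp(-0.5 * sq_dist)

def position(phi, prev_pos, w, u, b):
    """Recurrent position in [-1, 1] (assumed tanh form).

    The previous position feeds back into the current decision,
    which is what makes the learner recurrent.
    """
    return np.tanh(w @ phi + u * prev_pos + b)

def net_return(pos, prev_pos, price_change, cost, carry):
    """One-step profit and loss, net of costs (hypothetical form).

    prev_pos * price_change : P&L from holding the previous position
    cost * |pos - prev_pos| : transaction cost of changing the position
    carry * prev_pos        : signed overnight funding on the held position
    """
    return prev_pos * price_change - cost * abs(pos - prev_pos) + carry * prev_pos

def quadratic_utility(r, risk_aversion):
    """Quadratic utility of a net return: rewards mean, penalises squared return."""
    return r - 0.5 * risk_aversion * r ** 2

# Usage with made-up numbers: two hidden units, two input features.
means = np.array([[0.0, 0.0], [1.0, 1.0]])
variances = np.ones((2, 2))
phi = rbf_features(np.zeros(2), means, variances)
pos = position(phi, prev_pos=0.0, w=np.array([0.5, -0.5]), u=0.1, b=0.0)
r = net_return(pos, prev_pos=0.0, price_change=0.002, cost=0.0001, carry=0.0)
u_t = quadratic_utility(r, risk_aversion=2.0)
```

In a setup like this, the agent's weights `w`, `u` and `b` would be adjusted online to increase the utility, so transaction and funding costs influence the learned position through the reward itself rather than through a separate rule.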