We revisit the classic online portfolio selection problem, where at each round a learner selects a distribution over a set of portfolios to allocate its wealth. It is known that for this problem a logarithmic regret with respect to Cover's loss is achievable using the Universal Portfolio Selection algorithm, for example. However, all existing algorithms that achieve a logarithmic regret for this problem have per-round time and space complexities that scale polynomially with the total number of rounds, making them impractical. In this paper, we build on the recent work by Haipeng et al. 2018 and present the first practical online portfolio selection algorithm with a logarithmic regret and whose per-round time and space complexities depend only logarithmically on the horizon. Behind our approach are two key technical novelties of independent interest. We first show that the Damped Online Newton steps can approximate mirror descent iterates well, even when dealing with time-varying regularizers. Second, we present a new meta-algorithm that achieves an adaptive logarithmic regret (i.e. a logarithmic regret on any sub-interval) for mixable losses.
翻译:我们重新审视了典型的在线投资组合选择问题, 每回合一位学习者都会选择一组投资组合的分布,以分配财富。 众所周知, 在这个问题上, 使用通用组合选择算法可以实现对封面损失的对数遗憾。 然而, 实现对这一问题的对数遗憾的所有现有算法都有时间和空间复杂性, 其规模与周期总数成倍地扩大, 因而不切实际。 在本文中, 我们以Haipeng 等人( Haipeng 等人( 2018) 的最新工作为基础, 展示了第一个实用的在线投资组合选择算法, 带有对数遗憾, 其每轮时间和空间复杂性仅取决于对数的地平线。 在我们的方法背后, 有两个独立的技术创新之处。 我们首先显示, Damped Online Newton 步骤可以接近反向下沉积, 即使在与时间变化的调整者打交道时, 也能够很好地显示它。 第二, 我们提出一个新的元算法, 能够实现可调整的对数损失的对数( i. e. alogricalphrimic recle) resour) 。