通过低级别优化学习Markov模式 (Learning Markov models via low-rank optimization)

Modeling unknown systems from data is a precursor of system optimization and sequential decision making. In this paper, we focus on learning a Markov model from a single trajectory of states. Suppose that the transition model has a small rank despite of having a large state space, meaning that the system admits a low-dimensional latent structure. We show that one can estimate the full transition model accurately using a trajectory of length that is proportional to the total number of states. We propose two maximum likelihood estimation methods: a convex approach with nuclear-norm regularization and a nonconvex approach with rank constraint. We explicitly derive the statistical rates of both estimators in terms of the Kullback-Leiber divergence and the $\ell_2$ error and also establish a minimax lower bound to assess the tightness of these rates. For computing the nonconvex estimator, we develop a novel DC (difference of convex function) programming algorithm that starts with the convex M-estimator and then successively refines the solution till convergence. Empirical experiments demonstrate consistent superiority of the nonconvex estimator over the convex one.

翻译：从数据中建模未知系统是系统优化和顺序决策的先导。在本文中, 我们侧重于从单一的国家轨迹中学习Markov 模型。假设过渡模型尽管有较大的状态空间, 但其排名较低, 意味着系统接受低维潜伏结构。我们显示, 可以使用与国家总数成正比的长度轨迹准确估算整个过渡模型。我们提出两种最大可能性估算方法: 一种是核中枢调节法, 另一种是等级限制的非对流法。我们以 Kullback- Leiber 的差异和 $\ ell_ 2$ 错误来明确得出两个测算器的统计率, 并且还建立一个小负缩微摩克斯, 以评估这些比率的紧凑性。为了计算非convex 估计器, 我们开发了一个新颖的DC( convex 函数的宽度) 编程算算法, 它从 convex M- 估测算器开始, 然后连续调整解算法, 直至趋同。实验显示非 convex 。