This paper is concerned with the computational complexity of learning the Hidden Markov Model (HMM). Although HMMs are some of the most widely used tools in sequential and time series modeling, they are cryptographically hard to learn in the standard setting where one has access to i.i.d. samples of observation sequences. In this paper, we depart from this setup and consider an interactive access model, in which the algorithm can query for samples from the conditional distributions of the HMMs. We show that interactive access to the HMM enables computationally efficient learning algorithms, thereby bypassing cryptographic hardness. Specifically, we obtain efficient algorithms for learning HMMs in two settings: (a) An easier setting where we have query access to the exact conditional probabilities. Here our algorithm runs in polynomial time and makes polynomially many queries to approximate any HMM in total variation distance. (b) A harder setting where we can only obtain samples from the conditional distributions. Here the performance of the algorithm depends on a new parameter, called the fidelity of the HMM. We show that this captures cryptographically hard instances and previously known positive results. We also show that these results extend to a broader class of distributions with latent low rank structure. Our algorithms can be viewed as generalizations and robustifications of Angluin's $L^*$ algorithm for learning deterministic finite automata from membership queries.
翻译:本文关注学习隐藏 Markov 模型( HMM) 的计算复杂性。 虽然 HMM 是连续和时间序列建模中最广泛使用的工具之一, 但是在标准设置中, 当您能够访问 i. d. 观察序列样本时, HMM 是很难在密码学上学习的。 在本文中, 我们离开此设置, 并考虑交互式访问模式, 算法可以从 HMM 有条件分布中查询样本。 我们显示, 互动访问 HMM 能够实现计算效率高的学习算法, 从而绕过加密的硬性。 具体地说, 我们获得在两个设置中学习 HMM 的有效算法 : (a) 在标准设置中, 我们可以查询确切的有条件的概率概率。 我们的算法在多数值中运行, 并且从多数值上可以查询任何 HMMMM 。 (b) 更难设定我们只能从有条件分布中获得的样本。 在这里, 算法的功能取决于一个新的参数, 叫做 HMMMM 的忠实性 。 我们展示了这种精确的排序结构结构, 以及我们以前所知道的直观的高级算法的直观分析结果。</s>