Incorporating syntax into neural approaches in NLP has a multitude of practical and scientific benefits. For instance, a syntax-aware language model is likely to produce better samples; even a discriminative model like BERT, equipped with a syntax module, could be used for core NLP tasks like unsupervised syntactic parsing. Rapid progress in recent years was arguably spurred on by the empirical success of the Parsing-Reading-Predict architecture of Shen et al. (2018a), later simplified by the Order Neuron LSTM of Shen et al. (2019). Most notably, this was the first time neural approaches were able to successfully perform unsupervised syntactic parsing (evaluated by metrics such as F1 score). However, even a heuristic (much less a fully mathematical) understanding of why and when these architectures work is lagging severely behind. In this work, we answer representational questions raised by the architectures of Shen et al. (2018a, 2019), as well as by some transition-based syntax-aware language models (Dyer et al., 2016): what kind of syntactic structure can current neural approaches to syntax represent? Concretely, we ground this question in the sandbox of probabilistic context-free grammars (PCFGs), and identify a key aspect of the representational power of these approaches: the amount and directionality of context that the predictor has access to when forced to make a parsing decision. We show that with limited context (either bounded or unidirectional), there are PCFGs for which these approaches cannot represent the max-likelihood parse; conversely, if the context is unlimited, they can represent the max-likelihood parse of any PCFG.
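To make the notion of a max-likelihood parse concrete, the following is a minimal sketch of Viterbi CKY decoding for a PCFG in Chomsky normal form. It is not one of the neural architectures discussed above; it is the standard dynamic program, which has access to the entire sentence. The toy grammar, its probabilities, and the function name `viterbi_cky` are assumptions made purely for illustration.

```python
import math

# Hypothetical toy PCFG in Chomsky normal form (illustrative only).
# Rule probabilities sum to 1 for each left-hand-side nonterminal.
BINARY = {
    ("S", "NP", "VP"): 1.0,
    ("VP", "V", "NP"): 0.6,
    ("VP", "VP", "PP"): 0.4,
    ("NP", "NP", "PP"): 0.3,
    ("PP", "P", "NP"): 1.0,
}
LEXICAL = {
    ("NP", "she"): 0.2,
    ("NP", "stars"): 0.3,
    ("NP", "telescopes"): 0.2,
    ("V", "saw"): 1.0,
    ("P", "with"): 1.0,
}


def viterbi_cky(words, root="S"):
    """Return (log_prob, tree) of the most likely parse, or None if none exists."""
    n = len(words)
    # best[(i, j)] maps a nonterminal to (log_prob, tree) for the span words[i:j].
    best = {}
    for i, w in enumerate(words):
        cell = {}
        for (a, word), p in LEXICAL.items():
            if word == w:
                cell[a] = (math.log(p), (a, w))
        best[(i, i + 1)] = cell
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            cell = {}
            for k in range(i + 1, j):  # split point between the two children
                left, right = best[(i, k)], best[(k, j)]
                for (a, b, c), p in BINARY.items():
                    if b in left and c in right:
                        score = math.log(p) + left[b][0] + right[c][0]
                        if a not in cell or score > cell[a][0]:
                            cell[a] = (score, (a, left[b][1], right[c][1]))
            best[(i, j)] = cell
    return best.get((0, n), {}).get(root)


if __name__ == "__main__":
    result = viterbi_cky("she saw stars with telescopes".split())
    if result is not None:
        log_p, tree = result
        print(math.exp(log_p), tree)
```

Under this toy grammar the sentence is ambiguous: attaching the prepositional phrase to the verb phrase has probability of about 0.00288, while attaching it to the noun phrase has probability of about 0.00216, so the decoder returns the verb-phrase attachment. Note that the chart combines evidence from both sides of every split point, which is the kind of unrestricted, bidirectional context that the abstract contrasts with bounded or unidirectional predictors.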