We study the learnability of symbolic finite state automata (SFA), a model shown useful in many applications in software verification. The state-of-the-art literature on this topic follows the query learning paradigm, and so far all obtained results are positive. We provide a necessary condition for efficient learnability of SFAs in this paradigm, from which we obtain the first negative result. The main focus of our work lies in the learnability of SFAs under the paradigm of identification in the limit using polynomial time and data, and its strengthening efficient identifiability, which are concerned with the existence of a systematic set of characteristic samples from which a learner can correctly infer the target language. We provide a necessary condition for identification of SFAs in the limit using polynomial time and data, and a sufficient condition for efficient learnability of SFAs. From these conditions we derive a positive and a negative result. The performance of a learning algorithm is typically bounded as a function of the size of the representation of the target language. Since SFAs, in general, do not have a canonical form, and there are trade-offs between the complexity of the predicates on the transitions and the number of transitions, we start by defining size measures for SFAs. We revisit the complexity of procedures on SFAs and analyze them according to these measures, paying attention to the special forms of SFAs: normalized SFAs and neat SFAs, as well as to SFAs over a monotonic effective Boolean algebra. This is an extended version of the paper with the same title published in CSL'22.
翻译:我们研究了符号有限状态自动机(SFA)的可学习性,这是一种在软件验证中被证明有用的模型。目前,这个领域的最新文献都采用了查询学习范式,迄今为止所有的结果都是积极的。我们提供了一个条件,说明采用这一范式对 SFAs 进行有效学习的必要性,从中得出了第一个消极的结果。我们的研究重点在于学习符号有限状态自动机的可推论性,在多项式时间和数据的限制下,以及其强化的有效可学习性。这些都关系到一个学习者是否存在一组系统性的典型样本,能够正确地推断目标语言。我们提供了 SFAs 在多项式时间和数据的极限识别范式下的必要条件,以及 SFAs 有效主干可学习性的充分条件。从这些条件中,我们得出了一个积极的和一个消极的结果。学习算法的性能通常会受到目标语言表示大小的限制。由于 SFAs 在一般情况下没有一个规范形式,并且转换谓词的复杂性和转换数量之间存在权衡,我们首先为 SFAs 定义了大小度量标准。我们重新审视了 SFAs 过程的复杂性,并根据这些度量标准对其进行了分析,特别关注了 SFAs 的特殊形式:归一化 SFAs 和整洁 SFAs,以及在单调有效布尔代数上的 SFAs。这是与同名论文 CSL'22 的扩展版本。