The notion of string attractor has been introduced in [Kempa and Prezza, 2018] in the context of Data Compression and it represents a set of positions of a finite word in which all of its factors can be "attracted". The smallest size $\gamma^*$ of a string attractor for a finite word is a lower bound for several repetitiveness measures associated with the most common compression schemes, including BWT-based and LZ-based compressors. The combinatorial properties of the measure $\gamma^*$ have been studied in [Mantaci et al., 2021]. Very recently, a complexity measure, called string attractor profile function, has been introduced for infinite words, by evaluating $\gamma^*$ on each prefix. Such a measure has been studied for automatic sequences and linearly recurrent infinite words [Schaeffer and Shallit, 2021]. In this paper, we study the relationship between such a complexity measure and other well-known combinatorial notions related to repetitiveness in the context of infinite words, such as the factor complexity and the recurrence. Furthermore, we introduce new string attractor-based complexity measures, in which the structure and the distribution of positions in a string attractor of the prefixes of infinite words are considered. We show that such measures provide a finer classification of some infinite families of words.
翻译:在数据压缩背景下,[Kempa和Prezza,2018年]引入了字符串吸引器的概念,这是一套限定词,其中所有因素都可以“吸引”的限定词。对于一个限定字来说,字符串吸引器最小的大小$\gamma ⁇ $,对于与最常见压缩计划相关的若干重复性措施来说,这种措施的重复性措施是比较低的,包括BWT和LZ的压缩机。在[Mantaci等人,2021年]中研究了该措施的组合特性$\gamma ⁇ $。最近,通过在每个前缀上评估$\gamma ⁇ $,采用了被称为字符串吸引器配置功能的一套复杂性措施,对无限字词进行了无限的介绍。此外,我们对自动序列和线性重复的无限重复性词进行了研究[Schaeffer和Shalit,2021年]。在本文中,我们研究了这种复杂度措施与其他众所周知的组合概念之间的关系,例如因素的复杂性和复现。此外,我们引入了一种无限的字符串吸引器结构和无限的分类。我们研究了这种无限的组合结构的分类。