A lens is a single program that specifies two data transformations at once: one converts data from a source format to a target format, and the other inverts the process. Over the past decade, researchers have developed many kinds of lenses with different properties. One class of lens languages operates over regular languages: these lenses convert strings drawn from one regular language to strings drawn from another regular language (and back again). In this paper, we define a more powerful language of lenses, which we call match-reference lenses, that is capable of translating between non-regular formats that contain repeated substrings, a primitive form of dependency. To define the non-regular formats themselves, we develop a new language, match-reference regular expressions, which are regular expressions that can bind variables to substrings and use those substrings repeatedly. These match-reference regular expressions are closely related to the familiar ``back-references'' found in traditional regular expression packages, but are redesigned to adhere to conventional programming-language lexical scoping conventions and to interact smoothly with lens language infrastructure. We define the semantics of match-reference regular expressions and match-reference lenses. We also define a new kind of automaton, the match-reference regex automaton system (MRRAS), for deciding string membership in the language of a match-reference regular expression. We illustrate our definitions with a variety of examples.
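To make the idea of a format with repeated substrings concrete, the following is a minimal sketch using the named back-references of Python's standard `re` module. It is not the match-reference regular expression language or the MRRAS defined in this paper. The two formats and the names `SOURCE`, `TARGET`, `get`, and `put` are assumptions chosen purely for illustration: each format requires a tag to appear twice, and the two functions together play the role of a simple bijective lens between them.

```python
import re

# Hypothetical source format: <tag>body</tag>.
# The closing tag must repeat the opening tag, so the format is not regular.
SOURCE = re.compile(r"<(?P<tag>[a-z]+)>(?P<body>[^<>{}\\]*)</(?P=tag)>")

# Hypothetical target format: \begin{tag}body\end{tag}, with the same constraint.
TARGET = re.compile(
    r"\\begin\{(?P<tag>[a-z]+)\}(?P<body>[^<>{}\\]*)\\end\{(?P=tag)\}"
)


def get(source: str) -> str:
    """Forward direction: convert a source string to the target format."""
    m = SOURCE.fullmatch(source)
    if m is None:
        raise ValueError("not in the source format: " + repr(source))
    tag, body = m["tag"], m["body"]
    return "\\begin{" + tag + "}" + body + "\\end{" + tag + "}"


def put(target: str) -> str:
    """Backward direction: convert a target string back to the source format."""
    m = TARGET.fullmatch(target)
    if m is None:
        raise ValueError("not in the target format: " + repr(target))
    tag, body = m["tag"], m["body"]
    return "<" + tag + ">" + body + "</" + tag + ">"


if __name__ == "__main__":
    s = "<b>hi</b>"
    t = get(s)
    print(t)
    print(put(t) == s)  # round trip recovers the original string
```

In this sketch the two directions are written by hand as separate functions. A lens language instead derives both directions from a single program, which is the property the abstract describes.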