The celebrated proverb that "speech is silver, silence is golden" has a long multinational history and multiple specific meanings. In written texts, punctuation can in fact be considered one of its manifestations. Indeed, the virtue of speaking and writing effectively involves, often decisively, the capacity to place breaks properly. In the present study, based on a large corpus of world-famous and representative literary texts in seven major Western languages, it is shown that the distribution of intervals between consecutive punctuation marks in almost all texts can universally be characterised by only two parameters of the discrete Weibull distribution, which can be given an intuitive interpretation in terms of the so-called hazard function. The values of these two parameters tend to be language-specific, however, and even appear to navigate translations. The properties of the computed hazard functions indicate that, among the studied languages, English turns out to be the least constrained by the necessity to place a consecutive punctuation mark to partition a sequence of words. This may suggest that, when compared to the other studied languages, English is more flexible, in the sense of allowing longer uninterrupted sequences of words. Spanish reveals a similar tendency, to only a slightly lesser extent.
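To make the two-parameter description concrete, the following minimal sketch evaluates a discrete Weibull distribution and its hazard function. It assumes the common parametrisation with survival function (1 - p)^(k^beta); the abstract does not state which form is used, and the function names and the parameter values in the usage lines are purely illustrative, not results of the study. Here k stands for the number of words between consecutive punctuation marks, and the hazard is the probability that a punctuation mark follows the k-th word given that none has appeared in the preceding k - 1 words.

```python
def discrete_weibull_pmf(k: int, p: float, beta: float) -> float:
    """P(X = k) for k = 1, 2, ... under the assumed parametrisation
    P(X > k) = (1 - p) ** (k ** beta)."""
    q = 1.0 - p
    return q ** ((k - 1) ** beta) - q ** (k ** beta)


def discrete_weibull_hazard(k: int, p: float, beta: float) -> float:
    """h(k) = P(X = k | X >= k): chance that the interval ends at word k,
    given that it has not ended earlier."""
    q = 1.0 - p
    return 1.0 - q ** (k ** beta - (k - 1) ** beta)


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only to show the shape of h(k).
    p, beta = 0.2, 1.3
    for k in range(1, 11):
        print(k, round(discrete_weibull_pmf(k, p, beta), 4),
              round(discrete_weibull_hazard(k, p, beta), 4))
```

In this parametrisation, beta = 1 reduces to the geometric distribution with a constant hazard equal to p, while beta > 1 gives a hazard that grows with k, meaning that the pressure to insert a punctuation mark increases as the uninterrupted sequence of words gets longer. A lower hazard curve then corresponds to a language that tolerates longer uninterrupted sequences.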