String matching is the problem of finding all occurrences of a pattern in a text. We propose improved versions of the fast family of string matching algorithms based on hashing $q$-grams. The improvement consists of choosing the minimal value of $q$ such that each $q$-gram of the pattern has a unique hash value. The new algorithms are faster than the algorithms of the HASH family for short patterns on large alphabets.
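
To make the selection of $q$ concrete, the following is a minimal sketch of how such a minimal value could be found for a given pattern. It is an illustration under stated assumptions, not the algorithm of the paper: the shift-and-add hash `qgram_hash`, the 16-bit table size, the bound `max_q`, and the function name `minimal_q` are all hypothetical choices. It also assumes that "unique" means that no two positions of the pattern share a hash value, so a repeated $q$-gram counts as a collision.

```python
def qgram_hash(s: bytes, i: int, q: int, bits: int = 16) -> int:
    """Hash the q-gram s[i:i+q] with a simple shift-and-add scheme.

    Assumption: this hash is illustrative only; the paper's actual
    hash function may differ.
    """
    mask = (1 << bits) - 1
    h = 0
    for c in s[i:i + q]:
        h = ((h << 2) + c) & mask
    return h


def minimal_q(pattern: bytes, max_q: int = 8):
    """Return the smallest q for which all q-grams of pattern have
    pairwise distinct hash values, or None if no q <= max_q works."""
    m = len(pattern)
    for q in range(1, min(max_q, m) + 1):
        hashes = [qgram_hash(pattern, i, q) for i in range(m - q + 1)]
        # Distinct hashes at every position means no collision for this q.
        if len(set(hashes)) == len(hashes):
            return q
    return None


if __name__ == "__main__":
    print(minimal_q(b"abcab"))
```

For the pattern `abcab`, the 1-grams and 2-grams repeat (`a`, `b`, and `ab` each occur twice), so under these assumptions the first value of $q$ with no collisions is 3.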