Normalized web distance (NWD) is a similarity measure, or normalized semantic distance, based on the World Wide Web or any other large electronic database (for instance Wikipedia) together with a search engine that returns reliable aggregate page counts. For sets of search terms, the NWD gives a similarity on a scale from 0 (identical) to 1 (completely different). The NWD approximates the similarity according to all (upper semi)computable properties. We develop the theory and give applications. The derivation of the NWD method is based on Kolmogorov complexity.
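To make the role of the aggregate page counts concrete, the following is a minimal sketch of the two-term special case only. It assumes the well-known pairwise formula of the normalized Google distance (Cilibrasi and Vitányi); the set version developed in this work generalizes it and is not reproduced here. The function name `pairwise_nwd`, its parameters, and all page counts are hypothetical, chosen purely for illustration.

```python
import math


def pairwise_nwd(f_x, f_y, f_xy, n_pages):
    """Two-term distance computed from aggregate page counts (illustrative sketch).

    f_x, f_y : number of pages containing term x, respectively term y
    f_xy     : number of pages containing both terms
    n_pages  : assumed total number of indexed pages (normalizing constant)
    """
    if f_x <= 0 or f_y <= 0:
        raise ValueError("each term must occur on at least one page")
    if f_xy <= 0:
        # The terms never co-occur, so the formula diverges.
        return math.inf

    log_x, log_y = math.log(f_x), math.log(f_y)
    numerator = max(log_x, log_y) - math.log(f_xy)
    denominator = math.log(n_pages) - min(log_x, log_y)
    if denominator <= 0:
        raise ValueError("n_pages must exceed the smaller single-term count")
    return numerator / denominator


# Hypothetical counts: both terms always appear together.
print(pairwise_nwd(1000, 1000, 1000, 10**6))  # 0.0
# Hypothetical counts: the terms share only a single page.
print(pairwise_nwd(100, 100, 1, 10**4))
```

With counts obtained from a real search engine, the result can fall outside the idealized range from 0 to 1, because reported page counts are only approximate.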