基于排名的链接方法I: 三元组比较和有向单复形 (Rank-based linkage I: triplet comparisons and oriented simplicial complexes)

Rank-based linkage is a new tool for summarizing a collection $S$ of objects according to their relationships. These objects are not mapped to vectors, and ``similarity'' between objects need be neither numerical nor symmetrical. All an object needs to do is rank nearby objects by similarity to itself, using a Comparator which is transitive, but need not be consistent with any metric on the whole set. Call this a ranking system on $S$. Rank-based linkage is applied to the $K$-nearest neighbor digraph derived from a ranking system. Computations occur on a 2-dimensional abstract oriented simplicial complex whose faces are among the points, edges, and triangles of the line graph of the undirected $K$-nearest neighbor graph on $S$. In $|S| K^2$ steps it builds an edge-weighted linkage graph $(S, \mathcal{L}, \sigma)$ where $\sigma(\{x, y\})$ is called the in-sway between objects $x$ and $y$. Take $\mathcal{L}_t$ to be the links whose in-sway is at least $t$, and partition $S$ into components of the graph $(S, \mathcal{L}_t)$, for varying $t$. Rank-based linkage is a functor from a category of out-ordered digraphs to a category of partitioned sets, with the practical consequence that augmenting the set of objects in a rank-respectful way gives a fresh clustering which does not ``rip apart`` the previous one. The same holds for single linkage clustering in the metric space context, but not for typical optimization-based methods. Open combinatorial problems are presented in the last section.

翻译：基于排名的链接是一个新工具，用于根据对象之间的关系总结一个集合$S$。这些对象没有被映射到向量，而且“相似度”不需要是数值或对称的。对象只需要使用Comparator将附近的对象按相似度排序，这是一个可传递但不必与整个集合的任何度量一致的排名系统。称这个排名系统为$S$上的排名。将基于排名的链接应用于从排名系统派生的K最近邻图。计算发生在一个二维抽象的有向单复形上，其面在$S$的无向K最近邻图的点、边和三角形之间。在$|S| K^2$步骤中，它构建了一个边加权联结图$(S, \mathcal{L}, \sigma)$，其中$\sigma(\{x,y\})$称为对象$x$和$y$之间的内偏移量。取$\mathcal{L}_t$为内偏移至少为$t$的链接，并将$S$划分为图$(S, \mathcal{L}_t)$的组件，对不同的$t$进行划分。基于排名的链接是从有向图范畴到分区集范畴的函子，其实际后果是以排名尊重的方式增加对象集，这给出了一个新的聚类，不会“撕裂”以前的聚类。最后一章介绍了开放的组合问题。