The rapid advancement of large language models (LLMs) has enabled significant strides in various fields. This paper introduces a novel approach to evaluating LLM embeddings through their inherent geometric properties. We investigate the structural properties of these embeddings using three complementary metrics: $\delta$-hyperbolicity, ultrametricity, and Neighbor Joining. $\delta$-hyperbolicity, a measure derived from geometric group theory, quantifies how far a metric space deviates from a tree-like structure. Ultrametricity, in contrast, characterizes strictly hierarchical structures in which distances obey the strong triangle inequality $d(x,z) \le \max\{d(x,y),\, d(y,z)\}$. Neighbor Joining also quantifies how tree-like the distance relationships are, but it does so specifically with respect to the tree reconstructed by the Neighbor Joining algorithm. By analyzing LLM-generated embeddings with these metrics, we uncover the extent to which the embedding space reflects an underlying hierarchical or tree-like organization. Our findings reveal that LLM embeddings exhibit varying degrees of hyperbolicity and ultrametricity, and that these properties correlate with performance on the underlying machine learning tasks.
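For concreteness, the following is a minimal Python sketch of how the first two quantities can be estimated from a pairwise distance matrix: $\delta$ via Gromov's four-point condition (for every quadruple, the two largest of the three pairwise distance sums differ by at most $2\delta$) and ultrametricity via the worst violation of the strong triangle inequality. This is an illustration of the definitions, not the paper's implementation; the function names, the brute-force enumeration, and the random embeddings in the usage example are all assumptions, and the Neighbor Joining comparison is omitted.

```python
import itertools
import numpy as np

def gromov_delta(D):
    """Four-point estimate of delta-hyperbolicity for a symmetric
    distance matrix D. For every quadruple (i, j, k, l) the two largest
    of the three pairwise sums may differ by at most 2*delta; we return
    the worst-case delta (0 for an exact tree metric)."""
    n = D.shape[0]
    delta = 0.0
    for i, j, k, l in itertools.combinations(range(n), 4):
        sums = sorted([D[i, j] + D[k, l],
                       D[i, k] + D[j, l],
                       D[i, l] + D[j, k]])
        delta = max(delta, (sums[2] - sums[1]) / 2.0)  # gap of two largest
    return delta

def ultrametric_violation(D):
    """Worst violation of the strong triangle inequality
    d(i, k) <= max(d(i, j), d(j, k)); 0 for a perfect ultrametric."""
    n = D.shape[0]
    worst = 0.0
    for i, j, k in itertools.permutations(range(n), 3):
        worst = max(worst, D[i, k] - max(D[i, j], D[j, k]))
    return worst

# Usage sketch on hypothetical embedding vectors (random, for illustration):
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 16))  # 8 points in a 16-dim embedding space
D = np.linalg.norm(X[:, None] - X[None, :], axis=-1)  # pairwise Euclidean
print(gromov_delta(D), ultrametric_violation(D))
```

Both quantities are scale-dependent, so in practice $\delta$ is often reported relative to the diameter of the point set to make values comparable across embedding spaces of different scale; the brute-force loops above are $O(n^4)$ and $O(n^3)$, so larger point sets are typically handled by sampling quadruples and triples.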