We propose a new approach, called PK-clustering, to help social scientists create meaningful clusters in social networks. Many clustering algorithms exist, but most social scientists find them difficult to understand, and existing tools provide no guidance for choosing an algorithm or for evaluating results in light of the scientists' prior knowledge. Our work introduces a new clustering approach and a visual analytics user interface that address this issue. The approach is based on a process that 1) captures the prior knowledge of the scientists as a set of incomplete clusters, 2) runs multiple clustering algorithms (similar to clustering ensemble methods), 3) visualizes the results of all the algorithms, ranked and summarized by how well each algorithm matches the prior knowledge, 4) evaluates the consensus between user-selected algorithms, and 5) allows users to review details and iteratively update the acquired knowledge. We describe our approach using an initial functional prototype, then provide two examples of use and early feedback from social scientists. We believe our clustering approach offers a novel constructive method to iteratively build knowledge while avoiding being overly influenced by the results of often randomly selected black-box clustering algorithms.
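To make steps 1 to 4 of the process concrete, the following is a minimal sketch in Python, not the prototype's implementation. It assumes that prior knowledge is a partial node-to-group mapping, that each algorithm's output is a node-to-cluster mapping, and that "how well an algorithm matches the prior knowledge" can be approximated by pairwise agreement restricted to the nodes the scientist has labelled. The function names, the toy data, and the scoring rule are all illustrative assumptions.

```python
from collections import Counter
from itertools import combinations


def prior_agreement(prior, clustering):
    """Fraction of prior-labelled node pairs on which the clustering and
    the prior knowledge agree (same group vs. different groups).
    Assumed scoring rule, chosen for illustration only."""
    pairs = list(combinations(sorted(prior), 2))
    agree = sum(
        (prior[u] == prior[v]) == (clustering[u] == clustering[v])
        for u, v in pairs
    )
    return agree / len(pairs)


def label_clusters(prior, clustering):
    """Name each cluster after the prior group most common among its members."""
    votes = {}
    for node, group in prior.items():
        votes.setdefault(clustering[node], Counter())[group] += 1
    return {cluster: c.most_common(1)[0][0] for cluster, c in votes.items()}


def consensus(prior, clusterings, selected):
    """For every node, the most frequent group proposed by the selected
    algorithms and the share of algorithms proposing it."""
    result = {}
    for node in clusterings[selected[0]]:
        proposals = []
        for name in selected:
            clustering = clusterings[name]
            names = label_clusters(prior, clustering)
            proposals.append(names.get(clustering[node]))
        group, count = Counter(proposals).most_common(1)[0]
        result[node] = (group, count / len(selected))
    return result


# Step 1: prior knowledge as incomplete clusters (nodes c and d are unknown).
prior = {"a": "family", "b": "family", "e": "business", "f": "business"}

# Step 2: hand-written stand-ins for the outputs of four clustering algorithms.
clusterings = {
    "algo_a": {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1},
    "algo_b": {"a": 0, "b": 0, "c": 1, "d": 1, "e": 1, "f": 1},
    "algo_c": {"a": 0, "b": 1, "c": 0, "d": 1, "e": 1, "f": 1},
    "algo_d": {"a": 0, "b": 0, "c": 0, "d": 0, "e": 0, "f": 0},
}

# Step 3: rank the algorithms by how well they match the prior knowledge.
scores = {name: prior_agreement(prior, c) for name, c in clusterings.items()}
ranking = sorted(scores, key=lambda name: -scores[name])

# Step 4: consensus between the algorithms the user selects.
agreed = consensus(prior, clusterings, ["algo_a", "algo_b"])

for name in ranking:
    print(f"{name}: {scores[name]:.2f}")
for node, (group, share) in sorted(agreed.items()):
    print(f"{node}: {group} ({share:.0%} of selected algorithms)")
```

In this toy data, the two selected algorithms fully agree on node d but split on node c, which is the kind of case a user would review in step 5 before updating the prior knowledge and repeating the process.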