A major challenge when using k-means clustering is choosing the parameter k, the number of clusters. In this letter, we point out that it is very easy to draw poor conclusions from a common heuristic, the "elbow method". Better alternatives have long been known in the literature, and we draw attention to some of these easy-to-use options, which often perform better. This letter is a call to stop using the elbow method altogether, because it severely lacks theoretical support. We encourage educators to discuss the problems of the method (if they introduce it in class at all) and to teach alternatives instead, while researchers and reviewers should reject conclusions drawn from the elbow method.