The goal of this thesis is to study the use of the Kantorovich-Rubinstein distance as to build a descriptor of sample complexity in classification problems. The idea is to use the fact that the Kantorovich-Rubinstein distance is a metric in the space of measures that also takes into account the geometry and topology of the underlying metric space. We associate to each class of points a measure and thus study the geometrical information that we can obtain from the Kantorovich-Rubinstein distance between those measures. We show that a large Kantorovich-Rubinstein distance between those measures allows to conclude that there exists a 1-Lipschitz classifier that classifies well the classes of points. We also discuss the limitation of the Kantorovich-Rubinstein distance as a descriptor.
翻译:暂无翻译