In this manuscript, we offer a gentle review of submodularity and supermodularity and their properties. We offer a plethora of submodular definitions; a full description of a number of example submodular functions and their generalizations; example discrete constraints; a discussion of basic algorithms for maximization, minimization, and other operations; a brief overview of continuous submodular extensions; and some historical applications. We then turn to how submodularity is useful in machine learning and artificial intelligence. This includes summarization, and we offer a complete account of the differences between and commonalities amongst sketching, coresets, extractive and abstractive summarization in NLP, data distillation and condensation, and data subset selection and feature selection. We discuss a variety of ways to produce a submodular function useful for machine learning, including heuristic hand-crafting, learning or approximately learning a submodular function or aspects thereof, and some advantages of the use of a submodular function as a coreset producer. We discuss submodular combinatorial information functions, and how submodularity is useful for clustering, data partitioning, parallel machine learning, active and semi-supervised learning, probabilistic modeling, and structured norms and loss functions.
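As a point of reference for the review that follows, the sketch below states the most common (diminishing-returns) definition of submodularity; it is offered only as an illustrative reminder and stands in for the many equivalent definitions catalogued in the manuscript itself.

A set function $f : 2^V \to \mathbb{R}$ on a finite ground set $V$ is submodular if, for every $A \subseteq B \subseteq V$ and every $v \in V \setminus B$,
\[
  f(A \cup \{v\}) - f(A) \;\ge\; f(B \cup \{v\}) - f(B),
\]
i.e., the marginal gain of adding an element never increases as the context set grows. A function is supermodular if the reverse inequality holds, and modular if both hold with equality.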