This paper proposes an active learning algorithm for regression and classification problems that uses inverse-distance weighting functions to select the feature vectors to query. The algorithm has the following features: (i) it supports both pool-based and population-based sampling; (ii) it is independent of the type of predictor used; (iii) it can handle known and unknown constraints on the queryable feature vectors; and (iv) it can run either sequentially or in batch mode, depending on how often the predictor is retrained. The method's potential is shown in numerical tests on illustrative synthetic problems and on real-world regression and classification datasets from the UCI repository. A Python implementation of the algorithm, which we call IDEAL (Inverse-Distance based Exploration for Active Learning), is available at \url{http://cse.lab.imtlucca.it/~bemporad/ideal}.
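To make the idea of inverse-distance-based query selection concrete, the following is a minimal pool-based sketch in Python. It is not the IDEAL package and does not reproduce its API: the abstract does not state the acquisition function, so the arctangent exploration term, the IDW-weighted residual used as an uncertainty proxy, the trade-off parameter `delta`, and all function names are assumptions chosen for illustration. The predictor is passed in as a plain callable, which mirrors the claim that the method is independent of the predictor type.

```python
import numpy as np

def idw_exploration(x, X):
    """Exploration score in [0, 1): 0 at already-sampled points, larger far from them.

    Assumed form: (2/pi) * atan(1 / sum_k 1/d_k^2), with d_k the distance to sample k.
    """
    d2 = np.sum((X - x) ** 2, axis=1)
    if np.any(d2 < 1e-12):
        return 0.0
    return float((2.0 / np.pi) * np.arctan(1.0 / np.sum(1.0 / d2)))

def idw_variance(x, X, y, predict):
    """IDW-weighted squared mismatch between labels y_k and the prediction at x."""
    d2 = np.sum((X - x) ** 2, axis=1)
    hit = d2 < 1e-12
    if np.any(hit):
        v = hit / hit.sum()          # x coincides with a sample: use that sample only
    else:
        w = 1.0 / d2                 # inverse squared-distance weights
        v = w / w.sum()              # normalize so the weights sum to 1
    return float(np.sum(v * (y - predict(x)) ** 2))

def select_query(pool, X, y, predict, delta=1.0):
    """Return the index of the pool candidate with the highest acquisition score."""
    scores = [idw_variance(x, X, y, predict) + delta * idw_exploration(x, X)
              for x in pool]
    return int(np.argmax(scores))

def demo(n_queries=5):
    """Sequential mode: refit a toy predictor after every single query."""
    pool = np.linspace(0.0, 1.0, 21).reshape(-1, 1)
    oracle = lambda x: np.sin(6.0 * x[0])        # hypothetical target function
    labeled = [0, 20]                             # start from the two end points
    for _ in range(n_queries):
        X = pool[labeled]
        y = np.array([oracle(x) for x in X])
        coef = np.polyfit(X[:, 0], y, 1)          # toy predictor: a straight-line fit
        predict = lambda x, c=coef: np.polyval(c, x[0])
        free = [i for i in range(len(pool)) if i not in labeled]
        labeled.append(free[select_query(pool[free], X, y, predict)])
    return labeled

if __name__ == "__main__":
    print(demo())
```

Refitting the predictor only after every few selections, instead of after each one, would turn the same loop into a batch-mode variant. Constraints on the queryable feature vectors could be handled in this sketch by filtering the candidate pool before scoring, though that is only one possible approach.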