Defining an efficient training set is one of the most delicate phases for the success of remote sensing image classification routines. The complexity of the problem, the limited temporal and financial resources, and the high intraclass variance can make an algorithm fail if it is trained with a suboptimal dataset. Active learning aims at building efficient training sets by iteratively improving model performance through sampling. A user-defined heuristic ranks the unlabeled pixels according to a function of the uncertainty of their class membership, and the user is then asked to provide labels for the most uncertain pixels. This paper reviews and tests the main families of active learning algorithms: committee-based, large margin-based, and posterior probability-based. For each family, the most recent advances in the remote sensing community are discussed, and some heuristics are detailed and tested. Several challenging remote sensing scenarios are considered, including very high spatial resolution and hyperspectral image classification. Finally, guidelines for choosing a suitable architecture are provided for new and/or inexperienced users.
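The loop described in the abstract (rank unlabeled pixels by uncertainty, ask the user to label the most uncertain ones, retrain) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the nearest-centroid classifier is a toy stand-in for whatever model is actually used, the uncertainty score is the gap between the two largest posteriors (a heuristic commonly called breaking ties, one example from the posterior probability-based family), and all names (`fit_centroids`, `posteriors`, `breaking_ties`, `active_learning`, `oracle`) are hypothetical.

```python
import math
import random


def fit_centroids(X, y):
    """Toy stand-in for a classifier: one mean vector per class."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in sums[c]] for c in sums}


def posteriors(centroids, x):
    """Softmax over negative squared distances to each class centroid."""
    scores = {c: -sum((a - b) ** 2 for a, b in zip(x, m))
              for c, m in centroids.items()}
    top = max(scores.values())  # subtract the max for numerical stability
    exp = {c: math.exp(s - top) for c, s in scores.items()}
    total = sum(exp.values())
    return {c: e / total for c, e in exp.items()}


def breaking_ties(probs):
    """Gap between the two largest posteriors; a small gap means high uncertainty."""
    ranked = sorted(probs.values(), reverse=True)
    return ranked[0] - ranked[1] if len(ranked) > 1 else 1.0


def active_learning(X_lab, y_lab, X_pool, oracle, n_iter=5, batch=2):
    """Iteratively move the most uncertain pool samples into the training set."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_pool)
    for _ in range(n_iter):
        if not pool:
            break
        model = fit_centroids(X_lab, y_lab)
        # Rank unlabeled samples from most to least uncertain.
        pool.sort(key=lambda x: breaking_ties(posteriors(model, x)))
        queried, pool = pool[:batch], pool[batch:]
        for x in queried:
            X_lab.append(x)
            y_lab.append(oracle(x))  # the "user" supplies the label
    return X_lab, y_lab, pool


# Usage on synthetic two-class data; the oracle is a hypothetical labeling rule.
random.seed(0)
pool = [[random.gauss(c * 3.0, 1.0), random.gauss(c * 3.0, 1.0)]
        for c in (0, 1) for _ in range(20)]
oracle = lambda x: int(x[0] + x[1] > 3.0)
X, y, rest = active_learning([[0.0, 0.0], [3.0, 3.0]], [0, 1], pool, oracle)
```

Swapping `breaking_ties` for another score (for example, disagreement among a committee of models, or distance to a large-margin decision boundary) changes the heuristic family while leaving the loop itself unchanged.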