The use of remote sensing in humanitarian crisis response is well established and has repeatedly proven its relevance. One persistent problem is obtaining gold annotations: doing so is costly and time-consuming, which makes it almost impossible to fine-tune models to new regions affected by a crisis. Where time is critical, resources are limited, and the environment is constantly changing, models have to evolve and provide flexible ways to adapt to a new situation. The question we want to answer is whether prioritizing samples yields better fine-tuning results than classical sampling methods when annotated data is scarce. We propose a method to guide data collection during fine-tuning, based on estimated model and sample properties such as the predicted IoU score, and we propose two formulas for calculating sample priority. Our approach blends techniques from interpretability, representation learning, and active learning. We have applied our method to U-Net, a deep learning model for semantic segmentation, in a remote sensing application of building detection, one of the core use cases of remote sensing in humanitarian applications. Preliminary results show the utility of sample prioritization for tuning semantic segmentation models when data is scarce.
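To make the idea of ranking samples by predicted IoU concrete, the following is a minimal sketch and not the method described above. The abstract does not state the two priority formulas, so the rule used here (priority = 1 - predicted IoU, meaning the samples the model is expected to segment worst are annotated first) is purely an assumption for illustration. The function names `iou` and `select_for_annotation` and the `budget` parameter are hypothetical.

```python
import numpy as np


def iou(pred_mask, true_mask):
    """Intersection over union of two binary masks (1 = building pixel)."""
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    union = np.logical_or(pred, true).sum()
    if union == 0:
        # Both masks are empty: treat as perfect agreement.
        return 1.0
    return float(np.logical_and(pred, true).sum() / union)


def select_for_annotation(predicted_iou, budget):
    """Return indices of the `budget` samples with the highest priority.

    Assumption: priority = 1 - predicted IoU. This is a placeholder rule,
    not one of the two formulas proposed in the paper.
    """
    priority = 1.0 - np.asarray(predicted_iou, dtype=float)
    # Sort by descending priority; a stable sort keeps ties in input order.
    order = np.argsort(-priority, kind="stable")
    return order[:budget].tolist()


# Example: four unlabeled tiles with estimated IoU scores, budget of two.
chosen = select_for_annotation([0.9, 0.2, 0.6, 0.4], budget=2)
```

In this sketch the predicted IoU values would come from some estimator of model quality on unlabeled tiles; how that estimate is produced is outside the scope of the example.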