Deep learning (DL) is revolutionizing the scientific computing community. Active learning has been identified as a promising way to reduce the data gap caused by the typically high cost of simulations and experiments. However, the deep active learning (DAL) literature is currently dominated by image classification problems and pool-based methods, which are not directly transferable to scientific computing, where regression problems predominate and there is no pre-defined 'pool' of unlabeled data. Here, for the first time, we investigate the robustness of DAL methods for scientific computing problems using ten state-of-the-art DAL methods and eight benchmark problems. We show that, to our surprise, the majority of the DAL methods are not robust, even compared to random sampling, when the ideal pool size is unknown. We further analyze the effectiveness and robustness of DAL methods and suggest that diversity is necessary for a robust DAL method for scientific computing problems.