Active learning (AL) is a machine learning approach that can achieve higher accuracy with fewer labeled training instances because it can ask an oracle to label the most valuable unlabeled data, which a query strategy selects iteratively and heuristically. Scientific experiments, although increasingly automated, still depend on human involvement in the design process and on exhaustive search of the experimental space. This article performs a retrospective study on a drug response dataset using a proposed AL scheme that combines alternating least squares (ALS) matrix factorization with deep neural networks (DNNs). It also proposes an AL query strategy based on expected loss minimization (ELM). The retrospective study demonstrates that scientific experimental design can be optimized by AL rather than set manually, and that the proposed ELM sampling strategy performs better in these experiments than alternatives such as random sampling and uncertainty sampling.
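The pool-based AL loop described in the abstract can be illustrated with a minimal sketch. It assumes a synthetic low-rank matrix standing in for the drug response data, a plain ridge-regularized ALS matrix factorization as the only predictor (the DNN component is omitted), and two baseline query strategies: random sampling and an uncertainty proxy based on disagreement among ALS fits with different random initializations. The function names (`als_fit`, `select_queries`, `run_active_learning`), the committee-variance proxy, and all hyperparameters are assumptions made for illustration. They are not the article's implementation, and the proposed ELM strategy is not reproduced here.

```python
import numpy as np

def als_fit(R, mask, rank=3, reg=0.1, iters=20, seed=0):
    """Fit R ~ U @ V.T on observed entries (mask == True) by alternating ridge regressions."""
    rng = np.random.default_rng(seed)
    n, m = R.shape
    U = rng.normal(scale=0.1, size=(n, rank))
    V = rng.normal(scale=0.1, size=(m, rank))
    ridge = reg * np.eye(rank)
    for _ in range(iters):
        for i in range(n):  # update row factors with V held fixed
            cols = mask[i]
            if cols.any():
                Vi = V[cols]
                U[i] = np.linalg.solve(Vi.T @ Vi + ridge, Vi.T @ R[i, cols])
        for j in range(m):  # update column factors with U held fixed
            rows = mask[:, j]
            if rows.any():
                Uj = U[rows]
                V[j] = np.linalg.solve(Uj.T @ Uj + ridge, Uj.T @ R[rows, j])
    return U @ V.T

def select_queries(R, mask, batch, strategy, rng, committee=5):
    """Choose `batch` unobserved (row, col) entries to send to the oracle."""
    candidates = np.argwhere(~mask)  # row-major order, same as boolean indexing
    if strategy == "random":
        pick = rng.choice(len(candidates), size=batch, replace=False)
    else:
        # Uncertainty proxy (an assumption): variance across ALS fits that
        # differ only in their random initialization.
        preds = np.stack([als_fit(R, mask, seed=s) for s in range(committee)])
        variance = preds.var(axis=0)[~mask]
        pick = np.argsort(variance)[-batch:]
    return candidates[pick]

def run_active_learning(truth, strategy, rounds=5, batch=20, init=40, seed=0):
    """Retrospective loop: `truth` plays the oracle; only masked entries are used for fitting."""
    rng = np.random.default_rng(seed)
    mask = np.zeros(truth.shape, dtype=bool)
    start = rng.choice(truth.size, size=init, replace=False)
    mask[np.unravel_index(start, truth.shape)] = True
    errors = []
    for _ in range(rounds):
        pred = als_fit(truth, mask)
        # RMSE on entries that have not been queried yet
        errors.append(float(np.sqrt(np.mean((pred - truth)[~mask] ** 2))))
        for i, j in select_queries(truth, mask, batch, strategy, rng):
            mask[i, j] = True  # the oracle reveals the queried response
    return mask, errors

if __name__ == "__main__":
    data_rng = np.random.default_rng(1)
    # Synthetic rank-3 "drug x cell line" matrix, purely illustrative
    truth = data_rng.normal(size=(15, 3)) @ data_rng.normal(size=(3, 12))
    for strategy in ("random", "uncertainty"):
        _, errors = run_active_learning(truth, strategy)
        print(strategy, [round(e, 3) for e in errors])
```

In a retrospective setting like this one, the fully measured matrix serves as the oracle, so different query strategies can be compared on the same data by tracking the held-out error after each round of queries. Which strategy performs better on this toy matrix says nothing about the article's reported results.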