We contribute an ML-based system for interactive labeling of image datasets, built into TensorBoard Projector, to speed up image annotation by humans. The tool visualizes feature spaces and makes them directly editable by integrating applied labels online; it also serves as a system for verifying and managing the label data used in machine learning. We propose realistic annotation emulation to evaluate the system design of interactive active learning, based on our improved semi-supervised extension of t-SNE dimensionality reduction. Our active learning tool can significantly increase labeling efficiency compared to uncertainty sampling, and we show that fewer than 100 labeling actions are typically sufficient for good classification on a variety of specialized image datasets. Our contribution is unique in that it has to perform dimensionality reduction, feature space visualization and editing, interactive label propagation, low-complexity active learning, human perceptual modeling, annotation emulation, and unsupervised feature extraction for specialized datasets, all within a production-quality implementation.
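For readers unfamiliar with the baseline named above, the following is a minimal sketch of uncertainty sampling in its common least-confidence form. It illustrates the comparison baseline only, not the tool's own active learning method; the function name `least_confidence_query`, its parameters, and the toy probabilities are assumptions made for illustration.

```python
def least_confidence_query(probabilities, labeled, k=1):
    """Pick the k unlabeled samples whose top-class probability is lowest.

    probabilities: one list of class probabilities per sample (hypothetical
                   classifier output, not data from the paper).
    labeled:       set of sample indices that already carry a label.
    """
    candidates = [
        (max(p), i) for i, p in enumerate(probabilities) if i not in labeled
    ]
    # Lowest confidence first; ties are broken by sample index.
    candidates.sort()
    return [i for _, i in candidates[:k]]


# Toy predictions for four samples and three classes (illustrative values).
probs = [
    [0.90, 0.05, 0.05],
    [0.40, 0.35, 0.25],
    [0.55, 0.30, 0.15],
    [0.34, 0.33, 0.33],
]
already_labeled = {3}
picked = least_confidence_query(probs, already_labeled, k=2)
print(picked)
```

Each query is then shown to the annotator, the new label is added, and the classifier is retrained before the next query is chosen.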