We introduce VOCALExplore, a system designed to support users in building domain-specific models over video datasets. VOCALExplore supports interactive labeling sessions and trains models using user-supplied labels. VOCALExplore maximizes model quality by automatically deciding how to select samples based on observed skew in the collected labels. It also selects the optimal video representations to use when training models by casting feature selection as a rising bandit problem. Finally, VOCALExplore implements optimizations to achieve low latency without sacrificing model performance. We demonstrate that VOCALExplore achieves close to the best possible model quality given candidate acquisition functions and feature extractors, and it does so with low visible latency (~1 second per iteration) and no expensive preprocessing.
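To make the rising bandit framing of feature selection concrete, the following is a minimal sketch and not VOCALExplore's actual implementation. It assumes each candidate feature extractor is an arm whose validation score lies in [0, 1] and tends to rise with diminishing gains as more labels arrive. The function name `rising_bandit_select`, the parameters `horizon` and `window`, and the specific elimination rule (drop an arm when a linear extrapolation of its recent gain cannot reach the best current score) are all illustrative assumptions.

```python
from typing import Dict, List


def rising_bandit_select(
    curves: Dict[str, List[float]],
    horizon: int,
    window: int = 2,
) -> List[str]:
    """Return the candidate feature extractors that survive elimination.

    curves[name] holds the validation scores (in [0, 1]) observed so far for
    the hypothetical extractor `name`, one score per labeling iteration.
    horizon is the total number of iterations we are willing to spend.
    """
    bounds = {}
    for name, scores in curves.items():
        t = len(scores)
        # Scores are assumed to keep rising, so the latest one is a lower bound.
        lower = scores[-1]
        w = min(window, t - 1)
        if w == 0:
            # Only one observation: no slope estimate yet, so stay optimistic.
            upper = 1.0
        else:
            # Assuming diminishing gains, the recent slope over-estimates
            # future growth, so a linear extrapolation is an upper bound.
            slope = max(0.0, (scores[-1] - scores[-1 - w]) / w)
            upper = min(1.0, lower + slope * max(0, horizon - t))
        bounds[name] = (lower, upper)

    # Keep an arm only if it could still match the best current score.
    best_lower = max(lower for lower, _ in bounds.values())
    return [name for name, (_, upper) in bounds.items() if upper >= best_lower]


# Hypothetical score curves for three made-up extractors.
example_curves = {
    "extractor_a": [0.50, 0.70, 0.80],
    "extractor_b": [0.30, 0.32, 0.33],
    "extractor_c": [0.40, 0.60, 0.75],
}
survivors = rising_bandit_select(example_curves, horizon=6)
```

In this made-up example, `extractor_b` has plateaued well below the others and is eliminated, while the other two remain candidates. Dropping arms early in this way means later iterations no longer have to extract features or train models for representations that cannot catch up.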