The rapidly evolving industry demands highly accurate models without the time-consuming and computationally expensive experiments required for fine-tuning. Moreover, a model and training pipeline that were once carefully optimized for a specific dataset rarely generalize well to training on a different dataset. This makes it unrealistic to maintain a carefully fine-tuned model for each use case. To solve this, we propose an alternative approach that also forms the backbone of the Intel Geti platform: a dataset-agnostic template for object detection training, consisting of carefully chosen, pre-trained models together with a robust training pipeline for further training. Our solution works out of the box and provides a strong baseline on a wide range of datasets. It can be used on its own or as a starting point for further fine-tuning on specific use cases when needed. We obtained the dataset-agnostic templates by performing parallel training on a corpus of datasets and optimizing the choice of architectures and training tricks with respect to the average results over the whole corpus. We examined a number of architectures, taking the accuracy-performance trade-off into account. Consequently, we propose three finalists, VFNet, ATSS, and SSD, that can be deployed on CPU using the OpenVINO toolkit. The source code is available as part of the OpenVINO Training Extensions (https://github.com/openvinotoolkit/training_extensions).
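The selection criterion described above (optimizing over the average results across the whole corpus) can be sketched as follows. This is an illustrative simplification, not the paper's actual code: the architecture names, dataset names, and score values are hypothetical placeholders, and the real selection also weighs training tricks and inference performance.

```python
# Illustrative sketch of dataset-agnostic template selection: pick the
# candidate architecture whose validation score, averaged over a corpus
# of datasets, is highest. All names and numbers below are hypothetical.
from statistics import mean


def select_template(scores_by_arch):
    """Return the architecture with the highest mean score across datasets.

    scores_by_arch maps an architecture name to a dict of
    {dataset_name: validation_score}.
    """
    return max(
        scores_by_arch,
        key=lambda arch: mean(scores_by_arch[arch].values()),
    )


# Hypothetical per-dataset scores (e.g., mAP) for three candidates.
scores = {
    "arch_fast":     {"dataset_1": 0.41, "dataset_2": 0.38, "dataset_3": 0.44},
    "arch_balanced": {"dataset_1": 0.52, "dataset_2": 0.49, "dataset_3": 0.50},
    "arch_accurate": {"dataset_1": 0.55, "dataset_2": 0.47, "dataset_3": 0.51},
}
best = select_template(scores)  # "arch_accurate" for these placeholder scores
```

A model chosen this way is not guaranteed to be best on any single dataset; the point is that it degrades gracefully across all of them, which is what makes the template usable out of the box.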