Tiny machine learning (TinyML) for low-power devices lacks robust datasets for development. We present Wake Vision, a large-scale dataset for person detection that contains over 6 million quality-filtered images. We provide two variants, Wake Vision (Large) and Wake Vision (Quality): the large variant is used for pretraining and knowledge distillation, while the higher-quality labels of the quality variant drive final model performance. The manually labeled validation and test sets reduce label error rates from 7.8% to 2.2% compared to previous standards. In addition, we introduce five detailed benchmark sets to evaluate model performance in real-world scenarios, including varying lighting, camera distances, and demographic characteristics. Training with Wake Vision improves accuracy by 1.93% over existing datasets, demonstrating the importance of dataset quality for low-capacity models and of dataset size for high-capacity models. The dataset, benchmarks, code, and models are available under the CC-BY 4.0 license and are maintained by the Edge AI Foundation.