Recently, we have witnessed the rise of novel ``event-based'' camera sensors for high-speed, low-power video capture. Rather than recording discrete image frames, these sensors output asynchronous ``event'' tuples with microsecond precision, only when the brightness change of a given pixel exceeds a certain threshold. Although these sensors have enabled compelling new computer vision applications, these applications often require expensive, power-hungry GPU systems, rendering them incompatible for deployment on the low-power devices for which event cameras are optimized. Whereas receiver-driven rate adaptation is a crucial feature of modern video streaming solutions, this topic is underexplored in the realm of event-based vision systems. On a real-world event camera dataset, we first demonstrate that a state-of-the-art object detection application is resilient to dramatic data loss, and that this loss may be weighted towards the end of each temporal window. We then propose a scalable streaming method for event-based data based on Media Over QUIC, prioritizing object detection performance and low latency. The application server can receive complementary event data across several streams simultaneously, and drop streams as needed to maintain a certain latency. With a latency target of 5 ms for end-to-end transmission across a small network, we observe an average reduction in detection mAP as low as 0.36. With a more relaxed latency target of 50 ms, we observe an average mAP reduction as low as 0.19.
翻译:暂无翻译