Edge computing enables efficient deep learning inference in resource-constrained environments. In this paper, we propose AMP4EC, an adaptive model partitioning framework that optimizes inference by dynamically partitioning deep learning models based on real-time resource availability. Our approach achieves a latency reduction of up to 78% and a throughput improvement of 414% compared to baseline methods.
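As a minimal illustration of what partitioning a model based on resource availability can look like, the sketch below splits a sequence of layers across nodes in proportion to each node's free CPU. It is not the AMP4EC algorithm, which the abstract does not specify: the `Node` class, the `cpu_free` metric, the per-layer cost values, and the `partition_layers` function are all hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical edge node; cpu_free is an assumed metric in [0.0, 1.0]."""
    name: str
    cpu_free: float


def partition_layers(layer_costs, nodes):
    """Assign consecutive layers to nodes in proportion to free CPU.

    layer_costs: assumed relative compute cost of each layer, in model order.
    Returns a dict mapping node name to the list of layer indices it runs.
    """
    total_cost = sum(layer_costs)
    total_free = sum(n.cpu_free for n in nodes)
    if total_free <= 0:
        raise ValueError("no node has free capacity")

    plan = {n.name: [] for n in nodes}
    idx = 0          # next layer to assign
    consumed = 0.0   # cost assigned so far
    target = 0.0     # cumulative cost budget up to the current node
    for pos, node in enumerate(nodes):
        target += total_cost * node.cpu_free / total_free
        last = pos == len(nodes) - 1
        # A layer goes to this node if its midpoint falls within the budget;
        # the last node takes whatever remains.
        while idx < len(layer_costs) and (
            last or consumed + layer_costs[idx] / 2 <= target
        ):
            plan[node.name].append(idx)
            consumed += layer_costs[idx]
            idx += 1
    return plan
```

Calling `partition_layers` again with updated `cpu_free` readings yields a new plan, which is one simple way to picture the "dynamic" part: the split point moves toward whichever node currently has more headroom.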