NPU Papers - 专知 (Zhuanzhi)

NPU

TZ-LLM: Protecting On-Device Large Language Models with Arm TrustZone

arXiv

0+ reads · Nov 17

Edge Deployment of Small Language Models, a comprehensive comparison of CPU, GPU and NPU backends

arXiv

0+ reads · Nov 27

Edge Deployment of Small Language Models, a comprehensive comparison of CPU, GPU and NPU backends

arXiv

0+ reads · Dec 9

Context-Driven Performance Modeling for Causal Inference Operators on Neural Processing Units

arXiv

0+ reads · Dec 17

Benchmarking Ultra-Low-Power $μ$NPUs

arXiv

0+ reads · Oct 31

From Principles to Practice: A Systematic Study of LLM Serving on Multi-core NPUs

arXiv

0+ reads · Oct 7

ReGate: Enabling Power Gating in Neural Processing Units

arXiv

0+ reads · Oct 3

Scaling LLM Test-Time Compute with Mobile NPU on Smartphones

arXiv

0+ reads · Sep 27

Benchmarking Ultra-Low-Power $μ$NPUs

arXiv

0+ reads · Mar 28

NVR: Vector Runahead on NPUs for Sparse Memory Access

arXiv

0+ reads · Mar 17

NVR: Vector Runahead on NPUs for Sparse Memory Access

arXiv

0+ reads · Feb 19

ActNAS: Generating Efficient YOLO Models using Activation NAS

arXiv

0+ reads · Nov 15, 2024

IANUS: Integrated Accelerator based on NPU-PIM Unified Memory System

arXiv

0+ reads · Oct 19, 2024

Neural Architecture Search of Hybrid Models for NPU-CIM Heterogeneous AR/VR Devices

arXiv

0+ reads · Oct 10, 2024

ActNAS: Generating Efficient YOLO Models using Activation NAS

arXiv

0+ reads · Oct 11, 2024
