npu-smi
npu-smi Command Reference
Quick reference for Huawei Ascend NPU device management commands.
Validated on a real host: Ascend 910B3 server (`npu-smi` software version 25.5.1) on 2026-03-27. Command availability and output fields can differ by platform and firmware.
Quick Start
npu-smi info -l # List all devices
npu-smi info -t health -i 0 # Check device health
npu-smi info -t temp -i 0 -c 0 # Check temperature
npu-smi info -t power -i 0 -c 0 # Check power
npu-smi info -t memory -i 0 -c 0 # Check memory
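The Quick Start queries can be wrapped into a single per-device sweep. This is a minimal sketch rather than anything shipped with the tool: `quick_check` is a hypothetical helper name, and the `NPU_SMI` variable is an assumption introduced here so the binary can be swapped out (for example with `echo` for a dry run). The chip ID defaults to 0, matching the commands above.

```shell
# Hypothetical helper: run the Quick Start queries for one device.
# NPU_SMI is an assumed override so the binary can be replaced for dry runs.
quick_check() {
    dev_id="$1"
    chip_id="${2:-0}"
    smi="${NPU_SMI:-npu-smi}"

    # Health is queried per device in the commands above.
    $smi info -t health -i "$dev_id"

    # Temperature, power and memory are queried per chip.
    for metric in temp power memory; do
        $smi info -t "$metric" -i "$dev_id" -c "$chip_id"
    done
}
```

Calling `quick_check 0` queries device 0, chip 0. Running `NPU_SMI=echo quick_check 0` prints the arguments that would be passed without touching the hardware.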
Device Queries
Basic Information
npu-smi info -l # List devices
npu-smi info -t health -i <id> # Health status (OK/Warning/Error)
npu-smi info -t board -i <id> # Board details (firmware, software version)
npu-smi info -m # List card/chip mapping and chip names
Real-time Metrics
npu-smi info -t temp -i <id> -c <chip> # Temperature (NPU, AI Core)
npu-smi info -t power -i <id> -c <chip> # Power usage and limit
npu-smi info -t memory -i <id> -c <chip> # Memory usage, total, rate
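For simple monitoring, the real-time queries can be sampled in a loop. The sketch below assumes nothing about the output format and only re-runs the query at a fixed interval; `poll_metric`, its argument order, and the `NPU_SMI` override are hypothetical names introduced for illustration.

```shell
# Hypothetical helper: sample one metric type a fixed number of times.
# Arguments: metric type, device ID, chip ID, sample count, interval (seconds).
poll_metric() {
    metric="$1"
    dev_id="$2"
    chip_id="$3"
    samples="${4:-5}"
    interval="${5:-2}"
    smi="${NPU_SMI:-npu-smi}"

    i=1
    while [ "$i" -le "$samples" ]; do
        printf '=== %s sample %s/%s ===\n' "$metric" "$i" "$samples"
        $smi info -t "$metric" -i "$dev_id" -c "$chip_id"
        # Sleep between samples, but not after the last one.
        if [ "$i" -lt "$samples" ]; then
            sleep "$interval"
        fi
        i=$((i + 1))
    done
}
```

For example, `poll_metric temp 0 0 10 5` would take ten temperature readings from device 0, chip 0, five seconds apart.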
Advanced Queries
npu-smi info -t ecc -i <id> -c <chip> # ECC errors and mode
npu-smi info -t usages -i <id> -c <chip> # Utilization (AI Core, memory, bandwidth)
npu-smi info -t proc-mem -i <id> -c <chip> # Per-process memory usage
npu-smi info -t pcie-err -i <id> -c <chip> # PCIe error counters
npu-smi info -t topo -i <id> -c <chip> # Inter-NPU topology / affinity
npu-smi info -t p2p-enable -i <id> -c <chip> # P2P capability (if supported)
npu-smi info -t product -i <id> -c <chip> # Product name and serial
Note: `npu-smi info -t npu` was not accepted on the validated host; use `npu-smi info -m` for chip mapping and `health`/`temp`/`power`/`usages`/`memory` for per-chip runtime data.
See: references/device-queries.md for output formats, examples, monitoring scripts, and platform identification (A2 vs A3).
Configuration
Temperature and Power Thresholds
npu-smi set -h # Show supported configurable types on this host
npu-smi set -t pwm-mode -d <0|1> # Fan control mode
npu-smi set -t pwm-duty-ratio -d <0-100> # Fan duty ratio
Mode Configuration
npu-smi set -t ecc-enable -i <id> -c <chip> -d <0|1> # ECC enable switch
npu-smi set -t p2p-mem-cfg -i <id> -c <chip> -d <0|1> # P2P memory configuration
npu-smi set -t vnpu-mode -d <0|1> # vNPU mode
npu-smi set -t cpu-freq-up -i <id> -d <0|1> # CPU frequency profile
Fan Control
npu-smi set -t pwm-mode -d <0|1> # 0=Manual, 1=Automatic
npu-smi set -t pwm-duty-ratio -d <0-100> # Fan speed (percent)
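Since the duty ratio only accepts 0-100, it can help to validate the value before invoking the tool. The following sketch makes one assumption that the source does not state: that the fan should be switched to manual mode (`pwm-mode` 0) before a duty ratio is applied. `set_fan_duty` and the `NPU_SMI` override are hypothetical names.

```shell
# Hypothetical helper: validate a duty ratio, then apply it.
# Assumption: manual mode (pwm-mode 0) is selected first; verify on your platform.
set_fan_duty() {
    ratio="$1"
    smi="${NPU_SMI:-npu-smi}"

    # Reject empty, non-numeric, or overly long input before comparing.
    case "$ratio" in
        ''|*[!0-9]*|????*)
            echo "duty ratio must be an integer from 0 to 100" >&2
            return 1
            ;;
    esac
    if [ "$ratio" -gt 100 ]; then
        echo "duty ratio must be an integer from 0 to 100" >&2
        return 1
    fi

    $smi set -t pwm-mode -d 0 && $smi set -t pwm-duty-ratio -d "$ratio"
}
```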
System Settings
npu-smi set -t mac-addr -i <id> -c <chip> -d <mac_id> -s "XX:XX:XX:XX:XX:XX"
npu-smi set -t boot-select -i <id> -c <chip> -d <3|4> # 3=M.2 SSD, 4=eMMC
npu-smi set -t cpu-freq-up -i <id> -d <0|1> # 0=1.9GHz/800MHz, 1=1.0GHz/800MHz
npu-smi set -t sys-log-enable -d <0|1> # System logging
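Because a MAC change needs a restart to take effect, checking the arguments up front avoids a wasted cycle. This sketch validates the `XX:XX:XX:XX:XX:XX` form and the 0-3 `mac_id` range before calling the tool; `is_valid_mac`, `set_mac_addr`, and the `NPU_SMI` override are hypothetical names, and the check is purely syntactic.

```shell
# Hypothetical helper: true if the argument is six colon-separated hex pairs.
is_valid_mac() {
    printf '%s\n' "$1" | grep -Eq '^([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$'
}

# Hypothetical helper: validate inputs, then set the MAC address.
# Arguments: device ID, chip ID, mac_id (0-3), MAC address.
set_mac_addr() {
    dev_id="$1"
    chip_id="$2"
    mac_id="$3"
    mac="$4"
    smi="${NPU_SMI:-npu-smi}"

    case "$mac_id" in
        0|1|2|3) ;;
        *) echo "mac_id must be 0-3 (eth0-eth3)" >&2; return 1 ;;
    esac
    if ! is_valid_mac "$mac"; then
        echo "MAC must look like XX:XX:XX:XX:XX:XX" >&2
        return 1
    fi

    $smi set -t mac-addr -i "$dev_id" -c "$chip_id" -d "$mac_id" -s "$mac"
}
```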
Clear Commands
npu-smi clear -t ecc-info -i <id> -c <chip> # Clear ECC errors
npu-smi clear -t tls-cert-period -i <id> -c <chip> # Restore cert threshold
See: references/configuration.md for parameter tables and examples.
Firmware Management
Upgrade Workflow
Query → Upgrade → Check Status → Activate → Restart
npu-smi upgrade -b <item> -i <id> # Query current version
npu-smi upgrade -t <item> -i <id> -f <file.hpm> # Upload firmware
npu-smi upgrade -q <item> -i <id> # Check upgrade status
npu-smi upgrade -a <item> -i <id> # Activate firmware
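The workflow above can be sketched as a function that stops at the first failing step. This is an illustration under assumptions, not a complete procedure: `upgrade_component` and the `NPU_SMI` override are hypothetical names, and the status check runs only once because the status output format is not shown here. A real script would likely need to poll it until the upgrade reports completion.

```shell
# Hypothetical helper: run query, upgrade, status check and activation in order.
# Arguments: item name (e.g. mcu), device ID, firmware file.
upgrade_component() {
    item="$1"
    dev_id="$2"
    fw_file="$3"
    smi="${NPU_SMI:-npu-smi}"

    $smi upgrade -b "$item" -i "$dev_id" || return 1               # current version
    $smi upgrade -t "$item" -i "$dev_id" -f "$fw_file" || return 1 # upload firmware
    $smi upgrade -q "$item" -i "$dev_id" || return 1               # status (single check)
    $smi upgrade -a "$item" -i "$dev_id" || return 1               # activate

    echo "Activated $item on device $dev_id; restart or power cycle as required."
}
```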
Components and Restart Requirements
| Component | Item Name | Restart Required |
|---|---|---|
| MCU | `mcu` | Yes (restart) |
| Bootloader | `bootloader` | Yes (restart) |
| VRD | `vrd` | Yes (power cycle 30s) |
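In a script, the restart requirements for these three components can be encoded as a small lookup so the right follow-up action is printed after activation. `restart_action` is a hypothetical helper; it covers only the item names listed in this section.

```shell
# Hypothetical helper: print the follow-up action for an upgraded component.
restart_action() {
    case "$1" in
        mcu|bootloader)
            echo "restart"
            ;;
        vrd)
            echo "power cycle (30s off)"
            ;;
        *)
            echo "unknown component: $1" >&2
            return 1
            ;;
    esac
}
```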
See: references/firmware-upgrade.md for complete procedures.
Virtualization (vNPU)
Queries
npu-smi info -t vnpu-mode # Query AVI mode (0=Container, 1=VM)
npu-smi info -t template-info # List all templates
npu-smi info -t template-info -i <id> # Templates for specific device
npu-smi info -t info-vnpu -i <id> -c <chip> # View vNPU info
Management
npu-smi set -t vnpu-mode -d <0|1> # Set AVI mode
npu-smi set -t create-vnpu -i <id> -c <chip> -f <template> [-v <vnpu_id>] [-g <vgroup_id>]
npu-smi set -t destroy-vnpu -i <id> -c <chip> -v <vnpu_id>
vNPU ID Range: [phy_id*16+100, phy_id*16+115]
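The range formula gives each physical ID a block of 16 vNPU IDs. A short sketch of the arithmetic, with `vnpu_id_range` and `vnpu_id_in_range` as hypothetical helper names:

```shell
# Hypothetical helper: print the first and last valid vNPU ID for a physical ID.
vnpu_id_range() {
    phy_id="$1"
    first=$((phy_id * 16 + 100))
    last=$((phy_id * 16 + 115))
    echo "$first $last"
}

# Hypothetical helper: succeed if vnpu_id lies within the range for phy_id.
vnpu_id_in_range() {
    phy_id="$1"
    vnpu_id="$2"
    # Split "first last" into the positional parameters $1 and $2.
    set -- $(vnpu_id_range "$phy_id")
    [ "$vnpu_id" -ge "$1" ] && [ "$vnpu_id" -le "$2" ]
}
```

For `phy_id` 0 the range is 100 to 115, and for `phy_id` 1 it is 116 to 131, so a value such as 132 would be rejected for `phy_id` 1.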
See: references/virtualization.md for vNPU creation and management.
Certificate Management
Queries
npu-smi info -t tls-csr-get -i <id> -c <chip> # Generate CSR (PEM format)
npu-smi info -t tls-cert -i <id> -c <chip> # View certificate details
npu-smi info -t tls-cert-period -i <id> -c <chip> # Check expiration threshold
npu-smi info -t rootkey -i <id> -c <chip> # Rootkey status
Management
npu-smi set -t tls-cert -i <id> -c <chip> -f "<tls.pem> <ca.pem> <subca.pem>"
npu-smi set -t tls-cert-period -i <id> -c <chip> -s <days> # Set threshold (7-180 days)
npu-smi clear -t tls-cert-period -i <id> -c <chip> # Restore default (90 days)
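The set and clear commands for the expiration threshold can be combined into one helper: apply a value when one is given and valid (7-180 days), otherwise restore the default. This is a sketch under that design assumption; `apply_cert_period` and the `NPU_SMI` override are hypothetical names.

```shell
# Hypothetical helper: set the cert expiration threshold, or restore the default.
# Arguments: device ID, chip ID, optional days (7-180). No days: restore default.
apply_cert_period() {
    dev_id="$1"
    chip_id="$2"
    days="${3:-}"
    smi="${NPU_SMI:-npu-smi}"

    # No value given: use the clear command to restore the default threshold.
    if [ -z "$days" ]; then
        $smi clear -t tls-cert-period -i "$dev_id" -c "$chip_id"
        return
    fi

    # Reject non-numeric or overly long input, then check the documented range.
    case "$days" in
        *[!0-9]*|????*)
            echo "days must be an integer from 7 to 180" >&2
            return 1
            ;;
    esac
    if [ "$days" -lt 7 ] || [ "$days" -gt 180 ]; then
        echo "days must be an integer from 7 to 180" >&2
        return 1
    fi

    $smi set -t tls-cert-period -i "$dev_id" -c "$chip_id" -s "$days"
}
```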
See: references/certificate-management.md for certificate lifecycle management.
Parameters Reference
| Parameter | Description | How to Get |
|---|---|---|
| `id` | Device ID (NPU ID) | `npu-smi info -l` |
| `chip_id` | Runtime chip ID | `npu-smi info -m` (usually 0 for Ascend chip, 1 for MCU on validated host) |
| `vnpu_id` | vNPU ID | Auto-assigned or specified in range |
| `mac_id` | MAC interface | 0=eth0, 1=eth1, 2=eth2, 3=eth3 |
Supported Platforms
- Atlas 200I DK A2 Developer Kit
- Atlas 500 A2 Smart Station
- Atlas 200I A2 Acceleration Module (RC/EP scenarios)
- Atlas A2/A3 Training Series
- Atlas Training Series
Note: Chip name (e.g., 910B3) does not indicate server platform (A2 vs A3). Use `dmidecode -t system | grep Product` or `npu-smi info -t product` to identify the server model. See references/device-queries.md for details.
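The two identification methods can be chained: try the product query first and fall back to `dmidecode` when the device reports that it does not support it. The sketch below is hedged on several points: `server_model` is a hypothetical helper, the `NPU_SMI` and `DMIDECODE` overrides are assumptions added for testing, and the match on the phrase "does not support" is based only on the message quoted in this document. `dmidecode` typically needs root.

```shell
# Hypothetical helper: print product information for the server.
# Arguments: device ID and chip ID (both default to 0).
server_model() {
    dev_id="${1:-0}"
    chip_id="${2:-0}"
    smi="${NPU_SMI:-npu-smi}"
    dmi="${DMIDECODE:-dmidecode}"

    # Prefer the product query when it succeeds and is supported.
    if out="$($smi info -t product -i "$dev_id" -c "$chip_id" 2>/dev/null)" &&
        ! printf '%s\n' "$out" | grep -q 'does not support'; then
        printf '%s\n' "$out"
    else
        # Fallback: read the product line from the system DMI table.
        $dmi -t system | grep Product
    fi
}
```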
Important Notes
- Most configuration commands require root permissions
- Device IDs from `npu-smi info -l`
- Chip IDs from `npu-smi info -m`
- MCU/bootloader upgrades require restart after activation
- VRD upgrades require power cycle (30+ seconds off)
- MAC/boot changes require restart
- Command availability varies by hardware platform
- `npu-smi info proc` was not supported on the validated 910B3 host; `npu-smi info -t proc-mem` worked
- `npu-smi info -t product` may return `This device does not support querying product.` on some server SKUs
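Two of the notes above (root permissions and platform-dependent availability) suggest a quick preflight before running configuration commands. A minimal sketch, with `preflight` and the `NPU_SMI` override as hypothetical names; it only checks that the binary is on the `PATH` and warns when the caller is not root.

```shell
# Hypothetical helper: basic checks before running configuration commands.
preflight() {
    smi="${NPU_SMI:-npu-smi}"

    # The tool must be installed and on PATH.
    if ! command -v "$smi" >/dev/null 2>&1; then
        echo "error: $smi not found on PATH" >&2
        return 1
    fi

    # Most configuration commands require root; warn rather than fail,
    # since read-only queries may still work.
    if [ "$(id -u)" -ne 0 ]; then
        echo "warning: not running as root; configuration commands may fail" >&2
    fi
    return 0
}
```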
Scripts
- scripts/npu-health-check.sh - Comprehensive device health check