Megatron Memory Estimator
Estimate GPU memory usage for Megatron-based models directly from HuggingFace configs or custom specifications.
Quick Start
Option 1: From HuggingFace Model (Recommended)
Estimate directly from HuggingFace model paths:
# DeepSeek-V3 (61 layers; they do not divide evenly across pipeline stages, so set the layer distribution explicitly when pp > 1)
python scripts/estimate_from_hf.py deepseek-ai/DeepSeek-V3 \
--tp 4 --pp 4 --ep 8 --num-gpus 128 --num-layers-in-last-pipeline-stage 16
# Qwen 3
python scripts/estimate_from_hf.py Qwen/Qwen3-235B-A22B \
--tp 8 --pp 4 --ep 4 --num-gpus 128
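The arithmetic behind the DeepSeek-V3 flags can be sketched as follows. This is an illustration only, not the estimator's own code: `split_layers` and `data_parallel_size` are hypothetical helpers, and the sketch assumes the layers left over after the last stage are split evenly across the other stages, and that context parallelism is 1 so the data-parallel size is simply `num_gpus / (tp * pp)`.

```python
def split_layers(num_layers, pp, last_stage_layers=None):
    """Return per-stage layer counts (hypothetical helper, not part of the tool).

    Assumes that every stage except the last receives the same number of layers.
    """
    if pp == 1:
        return [num_layers]
    if last_stage_layers is None:
        # Without an explicit last-stage count, require an even split.
        if num_layers % pp != 0:
            raise ValueError(
                f"{num_layers} layers do not divide evenly across {pp} stages"
            )
        return [num_layers // pp] * pp
    remaining = num_layers - last_stage_layers
    if remaining <= 0 or remaining % (pp - 1) != 0:
        raise ValueError("remaining layers do not divide evenly across earlier stages")
    return [remaining // (pp - 1)] * (pp - 1) + [last_stage_layers]


def data_parallel_size(num_gpus, tp, pp):
    """Data-parallel size implied by the GPU count (assumes context parallel = 1)."""
    model_parallel = tp * pp
    if num_gpus % model_parallel != 0:
        raise ValueError("num_gpus must be a multiple of tp * pp")
    return num_gpus // model_parallel


# Values taken from the DeepSeek-V3 command above.
print(split_layers(61, 4, 16))        # [15, 15, 15, 16]
print(data_parallel_size(128, 4, 4))  # 8
```

With 61 layers and `--pp 4`, an even split is impossible, which is why the command passes `--num-layers-in-last-pipeline-stage 16`: the remaining 45 layers then divide into three stages of 15.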