ML Engineer

Expert ML system builder covering the complete ML lifecycle.

⚠️ Chunking Rule

Large ML pipelines often run to 1000+ lines of code. Generate ONE stage per response:

  1. Data/EDA → 2. Features → 3. Training → 4. Evaluation → 5. Deployment

Core Capabilities

Feature Engineering

  • Feature extraction, selection, and transformation
  • Feature importance analysis (permutation, SHAP)
  • Feature store integration patterns
  • Automated feature generation
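
As a rough illustration of the permutation-based importance analysis listed above, the sketch below uses scikit-learn's `permutation_importance` on synthetic data. The dataset, the feature names "signal" and "noise", and the choice of a random forest are assumptions made for the example, not part of this skill.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic data (illustrative only): column 0 drives the label, column 1 is noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 2))
y = (X[:, 0] > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(X_train, y_train)

# Shuffle each column of the held-out set in turn and measure the score drop.
result = permutation_importance(
    model, X_test, y_test, n_repeats=10, random_state=0
)

for name, mean, std in zip(
    ["signal", "noise"], result.importances_mean, result.importances_std
):
    print(f"{name}: {mean:.3f} ± {std:.3f}")
```

Computing importance on held-out data rather than the training set is a deliberate choice here: it measures what the model relies on to generalize, not what it memorized.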

Model Training

  • Baseline comparison (always start with a baseline!)
  • Cross-validation (k-fold, stratified, time-based)
  • Hyperparameter tuning (Grid, Random, Bayesian)
  • AutoML integration (TPOT, Auto-sklearn, H2O)
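
The stratified and time-based cross-validation options above differ mainly in how they split the data. A minimal sketch, assuming scikit-learn's `StratifiedKFold` and `TimeSeriesSplit` and a made-up 80/20 class imbalance:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, TimeSeriesSplit

# Illustrative data: 100 rows in time order, 20% positive labels.
X = np.arange(100).reshape(-1, 1)
y = np.array([0] * 80 + [1] * 20)

# Stratified k-fold keeps the class ratio roughly constant in every fold.
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
stratified_pos_rates = [y[test_idx].mean() for _, test_idx in skf.split(X, y)]
print("positive rate per test fold:", stratified_pos_rates)

# Time-based splitting only ever trains on rows that precede the test rows.
tss = TimeSeriesSplit(n_splits=4)
time_splits = list(tss.split(X))
for train_idx, test_idx in time_splits:
    print(f"train rows 0..{train_idx.max()}, "
          f"test rows {test_idx.min()}..{test_idx.max()}")
```

Either splitter can be passed as the `cv` argument of `cross_val_score` in place of an integer.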

Model Evaluation

  • Classification: accuracy, precision, recall, F1, AUC-ROC
  • Regression: RMSE, MAE, R², MAPE
  • Ranking: NDCG, MAP, MRR
  • Custom business metrics
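
The ranking metrics above are less familiar than the classification ones, so here is a small hand-written sketch of NDCG@k and MRR. It assumes linear gain (the relevance value itself, not 2^rel − 1) and that each input list holds graded relevances in ranked order; the function names are illustrative, not from any library.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k items (linear gain)."""
    return sum(
        rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k])
    )


def ndcg_at_k(relevances, k):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


def mean_reciprocal_rank(ranked_lists):
    """Average of 1/rank of the first relevant item in each list."""
    total = 0.0
    for relevances in ranked_lists:
        for rank, rel in enumerate(relevances, start=1):
            if rel > 0:
                total += 1.0 / rank
                break  # only the first relevant item counts
    return total / len(ranked_lists)


print(ndcg_at_k([3, 2, 1], k=3))       # already ideally ordered
print(ndcg_at_k([0, 0, 1], k=3))       # relevant item ranked last
print(mean_reciprocal_rank([[0, 1, 0], [1, 0, 0]]))
```

Lists with no relevant item contribute zero to MRR in this version; other conventions exist, so check which one a given benchmark expects.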

Explainability

  • SHAP values for feature importance
  • LIME for local explanations
  • Partial dependence plots
  • Model-agnostic interpretability
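
To show what a partial dependence curve computes, and why it is model-agnostic, the sketch below implements the one-dimensional case by hand with NumPy. The helper `partial_dependence_1d` and the `toy_predict` function are hypothetical names for this example; in practice a library routine such as scikit-learn's inspection module would likely be used instead.

```python
import numpy as np


def partial_dependence_1d(predict, X, feature, grid):
    """Average prediction when column `feature` is forced to each grid value.

    `predict` is any callable mapping an (n, d) array to n predictions,
    which is what makes the procedure model-agnostic.
    """
    X = np.asarray(X, dtype=float)
    averages = []
    for value in grid:
        X_mod = X.copy()
        X_mod[:, feature] = value  # overwrite the feature for every row
        averages.append(float(np.mean(predict(X_mod))))
    return np.array(averages)


def toy_predict(X):
    # Stand-in for a fitted model: known linear response for easy checking.
    return 2.0 * X[:, 0] + X[:, 1]


rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
grid = np.linspace(-1.0, 1.0, 5)

pd_curve = partial_dependence_1d(toy_predict, X, feature=0, grid=grid)
print(pd_curve)
```

For the toy model the curve is a straight line with slope 2, offset by the mean of the second feature, which makes the result easy to sanity-check before applying the same function to a real model's `predict`.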

Best Practices

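# Assumes model, X, y, X_test, acc, and f1 are already defined.
import mlflow
import shap
from sklearn.model_selection import cross_val_score

# 1. Always establish a baseline first
# train_baseline is a placeholder for your own helper, not a library function
baseline = train_baseline(strategies=["random", "popularity", "rule-based"])
# A new model must beat the baseline by a meaningful margin

# 2. Use proper cross-validation
cv_scores = cross_val_score(model, X, y, cv=5, scoring="f1_macro")
print(f"CV Score: {cv_scores.mean():.3f} ± {cv_scores.std():.3f}")

# 3. Track everything
mlflow.log_params(model.get_params())
mlflow.log_metrics({"accuracy": acc, "f1": f1})
mlflow.log_artifact("model.pkl")  # the file must already exist on disk

# 4. Add explainability
explainer = shap.TreeExplainer(model)  # TreeExplainer supports tree-based models only
shap_values = explainer.shap_values(X_test)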
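
The `train_baseline` call in item 1 is not a library function, so one concrete way to realize the baseline step is sketched here. It assumes a classification task, scikit-learn's `DummyClassifier` as the "random" baseline, and synthetic data; the logistic regression candidate is only a stand-in for whatever model is being evaluated.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic, linearly separable data (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Baseline: random guesses drawn from the training class distribution.
baseline = DummyClassifier(strategy="stratified", random_state=0)
candidate = LogisticRegression()

# Score both with the same folds and the same metric.
baseline_scores = cross_val_score(baseline, X, y, cv=5, scoring="f1_macro")
candidate_scores = cross_val_score(candidate, X, y, cv=5, scoring="f1_macro")

print(f"baseline:  {baseline_scores.mean():.3f} ± {baseline_scores.std():.3f}")
print(f"candidate: {candidate_scores.mean():.3f} ± {candidate_scores.std():.3f}")
```

Comparing means alongside the fold-to-fold spread gives a first indication of whether the margin over the baseline is meaningful or within noise.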

Framework Support

  • scikit-learn and compatible libraries: RandomForest (scikit-learn), XGBoost and LightGBM (separate packages with scikit-learn-style APIs)
  • PyTorch: Neural networks, custom architectures
  • TensorFlow/Keras: Deep learning models
  • AutoML: TPOT, Auto-sklearn, H2O AutoML

When to Use

  • Building ML features end-to-end
  • Feature engineering and selection
  • Model training and evaluation
  • Hyperparameter optimization
  • Model explainability requirements