skillcraft

SkillCraft - LLM Agent Tool Composition Benchmark

Description

Evaluate and analyze LLM agents' ability to form, abstract, and reuse higher-level tool compositions (Skills). Use this skill when researching agent skill discovery, tool composition patterns, or evaluating skill caching efficiency.

Trigger words: SkillCraft, skill discovery, tool composition, agent skills, skill caching, LLM benchmark, 技能发现 (skill discovery), 工具组合 (tool composition)

Core Concepts

Problem Statement

  • Traditional benchmarks test "can the agent call the right tool?"
  • SkillCraft tests "can the agent abstract and reuse tool combinations?"
  • This is the difference between tool usage and tool mastery
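
To make the usage-versus-mastery distinction concrete, here is a minimal sketch of what "abstracting and reusing a tool combination" could look like. Everything in it is hypothetical and for illustration only: the `lookup` and `summarize` tools, the `SkillLibrary` class, and the skill name are assumptions, not part of SkillCraft's actual interface.

```python
from typing import Callable, Dict, List

Tool = Callable[[str], str]


# Hypothetical stand-in tools; a real agent would call external APIs here.
def lookup(entity: str) -> str:
    return f"record({entity})"


def summarize(record: str) -> str:
    return f"summary[{record}]"


class SkillLibrary:
    """Stores named tool chains (skills) so they can be replayed later."""

    def __init__(self) -> None:
        self._skills: Dict[str, List[Tool]] = {}
        self.tool_calls = 0  # counts every underlying tool invocation

    def save(self, name: str, steps: List[Tool]) -> None:
        # Abstract a sequence of tool calls into one reusable skill.
        self._skills[name] = list(steps)

    def run(self, name: str, value: str) -> str:
        # Replay the stored chain, feeding each output into the next step.
        for step in self._skills[name]:
            value = step(value)
            self.tool_calls += 1
        return value


library = SkillLibrary()
library.save("lookup_and_summarize", [lookup, summarize])

# The same skill is reused for each entity instead of being re-planned.
results = [library.run("lookup_and_summarize", e) for e in ["a", "b"]]
print(results)
```

In this sketch the agent plans the chain once and replays it per entity, which is the kind of reuse the benchmark is described as measuring.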

Dual Difficulty Dimensions

  1. Quantitative Scaling - Increase number of entities/items to process
  2. Structural Scaling - Compose subtasks into longer, more complex tool chains
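
The two dimensions above can be sketched as independent knobs on a task description. This is an illustrative model only: `TaskSpec`, `scale_quantity`, and `scale_structure` are hypothetical names, and the sketch assumes each entity passes through every step of the chain exactly once, which may not match the benchmark's real task format.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TaskSpec:
    """Hypothetical task: run every step of `chain` on every entity."""

    entities: List[str]
    chain: List[str]

    @property
    def total_tool_calls(self) -> int:
        # Assumption: one tool call per (entity, step) pair.
        return len(self.entities) * len(self.chain)


def scale_quantity(task: TaskSpec, extra_entities: List[str]) -> TaskSpec:
    # Quantitative scaling: more items, same tool chain.
    return TaskSpec(task.entities + extra_entities, task.chain)


def scale_structure(task: TaskSpec, extra_steps: List[str]) -> TaskSpec:
    # Structural scaling: same items, longer tool chain.
    return TaskSpec(task.entities, task.chain + extra_steps)


base = TaskSpec(entities=["a"], chain=["fetch", "parse"])
wider = scale_quantity(base, ["b", "c"])
deeper = scale_structure(base, ["rank"])

print(base.total_tool_calls, wider.total_tool_calls, deeper.total_tool_calls)
```

Under this assumption, quantitative scaling repeats an unchanged chain, so a cached skill can be reused as is, while structural scaling changes the chain itself, so the skill has to be extended or composed with others.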

Installs: 1
Repository: dwsy/agent
GitHub Stars: 12
First Seen: Apr 10, 2026