OpenClaw Guardian

The missing safety layer for AI agents.

Why?

OpenClaw gives agents direct access to the shell, files, email, the browser, and more. About 99% of those tool calls are harmless. Guardian catches the 1% that aren't, without slowing down the rest.

How It Works

Tool Call → Blacklist Matcher (regex rules, 0ms)
   No match     → Pass instantly (99% of calls)
   Warning hit  → 1 LLM vote ("did the user ask for this?")
   Critical hit → 3 LLM votes (all must confirm user intent)

Two Blacklist Levels

Level      LLM Votes       Latency   Examples
No match   0               ~0ms      Reading files, git, normal ops
Warning    1               ~1-2s     rm -rf /tmp/cache, chmod 777, sudo apt
Critical   3 (unanimous)   ~2-4s     rm -rf ~/, mkfs, dd of=/dev/, shutdown

What Gets Checked

Only three tool types are inspected:

  • exec → command string matched against exec blacklist
  • write / edit → file path canonicalized and matched against path blacklist
  • Everything else passes through instantly
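
The dispatch described above can be sketched roughly as follows. This is an illustration only, not Guardian's actual code: the `ToolCall` shape, the function names, and the three sample path rules are assumptions made for the example, and the real hook payload and rule set may differ. Canonicalization here is purely lexical (it collapses `..` and expands `~`) and does not resolve symlinks.

```typescript
import * as os from "node:os";
import * as path from "node:path";

type Level = "none" | "warning" | "critical";

// Hypothetical shape of a tool call; the real OpenClaw payload may differ.
interface ToolCall {
  tool: string;
  command?: string; // used by exec
  filePath?: string; // used by write / edit
}

interface Rule {
  level: "warning" | "critical";
  pattern: RegExp;
}

// Sample rules only, ordered from most to least severe.
const EXEC_RULES: Rule[] = [{ level: "critical", pattern: /\bmkfs\b/ }];
const PATH_RULES: Rule[] = [
  { level: "critical", pattern: /^\/etc\/(passwd|shadow|sudoers)$/ },
  { level: "warning", pattern: /^\/etc\// },
  { level: "warning", pattern: /(^|\/)\.env$/ },
];

// Expand a leading "~" and collapse "." / ".." segments, so that
// "/tmp/../etc/passwd" is matched as "/etc/passwd".
function canonicalize(p: string, home: string = os.homedir()): string {
  const expanded =
    p === "~" || p.startsWith("~/") ? path.posix.join(home, p.slice(1)) : p;
  return path.posix.resolve(expanded);
}

// Return the level of the first rule that matches, or "none".
function firstMatch(text: string, rules: Rule[]): Level {
  const hit = rules.find((r) => r.pattern.test(text));
  return hit ? hit.level : "none";
}

function inspect(call: ToolCall): Level {
  switch (call.tool) {
    case "exec":
      return firstMatch(call.command ?? "", EXEC_RULES);
    case "write":
    case "edit":
      return call.filePath
        ? firstMatch(canonicalize(call.filePath), PATH_RULES)
        : "none";
    default:
      return "none"; // every other tool type passes through
  }
}
```

Matching on the canonical path rather than the raw string is what stops a write to `/tmp/../etc/passwd` from slipping past a rule anchored at `/etc/`.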

LLM Intent Verification

When a blacklist rule matches, Guardian asks a lightweight LLM: "Did the user explicitly request this?" It reads recent conversation context to prevent false positives.

  • Warning: 1 LLM call. Confirmed → proceed.
  • Critical: 3 parallel LLM calls. All 3 must confirm. Any "no" → block.

Guardian auto-discovers a cheap, fast model from your existing OpenClaw provider config (it prefers Haiku), so no separate API key is needed.

LLM Fallback

  • Critical + LLM down → blocked (fail-safe)
  • Warning + LLM down → asks user for manual confirmation
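
The voting and fallback rules above might be expressed as in the sketch below. The `Voter` type and the function names are hypothetical. Two behaviours are assumptions the text does not spell out: a warning that the LLM does not confirm is blocked, and a critical check in which only some of the three calls fail counts as "LLM down".

```typescript
type Level = "warning" | "critical";
type Decision = "allow" | "block" | "ask_user";

// Hypothetical voter: resolves to true when the LLM confirms that the user
// asked for the action, and rejects when the LLM cannot be reached.
type Voter = () => Promise<boolean>;

// Pure decision rule over the collected votes; null marks a failed LLM call.
function decide(level: Level, votes: (boolean | null)[]): Decision {
  const llmDown = votes.some((v) => v === null);
  if (level === "critical") {
    // All votes must confirm; any "no" or any failed call blocks (fail-safe).
    const unanimous = votes.length > 0 && votes.every((v) => v === true);
    return unanimous ? "allow" : "block";
  }
  if (llmDown) return "ask_user"; // warning + LLM down: defer to the user
  return votes[0] === true ? "allow" : "block";
}

// Run 1 vote for a warning or 3 parallel votes for a critical hit.
async function verifyIntent(level: Level, voter: Voter): Promise<Decision> {
  const count = level === "critical" ? 3 : 1;
  const settled = await Promise.allSettled(
    Array.from({ length: count }, () => voter()),
  );
  const votes = settled.map((r) => (r.status === "fulfilled" ? r.value : null));
  return decide(level, votes);
}
```

Keeping `decide` separate from the network calls makes the policy easy to unit test without a live model.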

Blacklist Rules

Critical (exec)

  • rm -rf on system paths (excludes /tmp/ and workspace)
  • mkfs, dd to block devices, redirects to /dev/sd*
  • Writes to /etc/passwd, /etc/shadow, /etc/sudoers
  • shutdown, reboot, disabling SSH
  • Bypass attempts: eval, absolute-path rm, interpreter one-liners (python -c, node -e)
  • Pipe attacks: curl | sh, wget | bash, base64 -d | sh
  • Chain attacks: download + chmod +x + execute

Warning (exec)

  • rm -rf on safe paths, sudo, chmod 777, chown root
  • Package install/remove, service management
  • Crontab mods, SSH/SCP, Docker ops, kill/killall
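
To make the two tiers concrete, here is a minimal sketch of how a few of the exec rules listed above could be written as regular expressions. The patterns and rule names are illustrative assumptions, not the rules shipped in blacklist.ts, and they are deliberately simplistic (for example, only the exact flag spelling `-rf` is covered).

```typescript
type Level = "none" | "warning" | "critical";

interface Rule {
  level: "warning" | "critical";
  name: string;
  pattern: RegExp;
}

// Illustrative patterns only; a production rule set needs far more coverage.
const RULES: Rule[] = [
  // rm -rf on "~" or on any absolute path that is not under /tmp/
  { level: "critical", name: "rm-rf-system", pattern: /\brm\s+-rf\s+(~|\/(?!tmp\/))/ },
  { level: "critical", name: "mkfs", pattern: /\bmkfs\b/ },
  { level: "critical", name: "dd-to-device", pattern: /\bdd\b.*\bof=\/dev\// },
  { level: "critical", name: "pipe-to-shell", pattern: /\b(curl|wget)\b[^|]*\|\s*(ba)?sh\b/ },
  { level: "warning", name: "rm-rf", pattern: /\brm\s+-rf\b/ },
  { level: "warning", name: "sudo", pattern: /\bsudo\b/ },
  { level: "warning", name: "chmod-777", pattern: /\bchmod\s+777\b/ },
];

// Check the critical tier first so the most severe match always wins.
function classify(command: string): { level: Level; rule?: string } {
  for (const level of ["critical", "warning"] as const) {
    const hit = RULES.find((r) => r.level === level && r.pattern.test(command));
    if (hit) return { level, rule: hit.name };
  }
  return { level: "none" };
}
```

With these sample rules, `classify("rm -rf /tmp/cache")` falls through the critical tier because of the `/tmp/` exclusion and lands on the warning-level `rm-rf` rule, which mirrors the split described in the lists above.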

Path Rules (write/edit)

  • Critical: system auth files, SSH keys, systemd units
  • Warning: dotfiles, /etc/ configs, .env files, authorized_keys

Audit Log

Every blacklist hit is logged to ~/.openclaw/guardian-audit.jsonl with a SHA-256 hash chain. The log is tamper-evident: each entry's hash covers its full content plus the previous entry's hash.
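
A hash chain of this kind can be sketched as below. The entry fields, the all-zero genesis value, and the in-memory array are assumptions made for illustration; the real logger's schema and on-disk format may differ. In a JSONL file, each entry would be written as one `JSON.stringify(entry)` line.

```typescript
import { createHash } from "node:crypto";

// Hypothetical entry schema for illustration.
interface AuditEntry {
  timestamp: string;
  tool: string;
  detail: string;
  level: "warning" | "critical";
  prevHash: string;
  hash: string;
}

// Assumed placeholder "previous hash" for the first entry in the chain.
const GENESIS = "0".repeat(64);

// Hash the full entry content together with the previous entry's hash.
function hashEntry(body: Omit<AuditEntry, "hash">): string {
  // Serialise fields in a fixed order so the digest is reproducible.
  const canonical = JSON.stringify([
    body.timestamp, body.tool, body.detail, body.level, body.prevHash,
  ]);
  return createHash("sha256").update(canonical).digest("hex");
}

function appendEntry(
  log: AuditEntry[],
  e: Omit<AuditEntry, "hash" | "prevHash">,
): AuditEntry {
  const prevHash = log.length > 0 ? log[log.length - 1].hash : GENESIS;
  const body = { ...e, prevHash };
  const entry: AuditEntry = { ...body, hash: hashEntry(body) };
  log.push(entry);
  return entry;
}

// Recompute every hash; any edited, removed, or reordered entry breaks the chain.
function verifyChain(log: AuditEntry[]): boolean {
  let prev = GENESIS;
  for (const entry of log) {
    const { hash, ...body } = entry;
    if (body.prevHash !== prev || hashEntry(body) !== hash) return false;
    prev = hash;
  }
  return true;
}
```

Because each hash depends on the one before it, changing an old entry invalidates every later entry unless the whole tail is recomputed, which is what makes after-the-fact edits detectable.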

Installation

openclaw plugins install openclaw-guardian

Or manually:

cd ~/.openclaw/workspace
git clone https://github.com/fatcatMaoFei/openclaw-guardian.git

Token Cost

Scenario   % of Ops   Extra Cost
No match   ~99%       0
Warning    ~0.5-1%    ~500 tokens
Critical   <0.5%      ~1500 tokens

Prefers cheap models (Haiku, GPT-4o-mini, Gemini Flash).
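
As a rough worked example, the figures above imply a small average overhead per tool call. The sketch below assumes the token costs apply per blacklist hit and picks shares from inside the stated ranges (0.75% warning, 0.25% critical); these exact shares are assumptions for the arithmetic, not measurements.

```typescript
interface Scenario {
  name: string;
  share: number; // fraction of all tool calls
  extraTokens: number; // assumed extra LLM tokens per call in this scenario
}

// Weighted average of extra tokens across all scenarios.
function expectedExtraTokens(scenarios: Scenario[]): number {
  return scenarios.reduce((sum, s) => sum + s.share * s.extraTokens, 0);
}

// Shares chosen from within the ranges in the table (assumed values).
const scenarios: Scenario[] = [
  { name: "no match", share: 0.99, extraTokens: 0 },
  { name: "warning", share: 0.0075, extraTokens: 500 },
  { name: "critical", share: 0.0025, extraTokens: 1500 },
];

console.log(expectedExtraTokens(scenarios).toFixed(2));
```

Under these assumptions the average works out to roughly 7.5 extra tokens per tool call, split about evenly between warning and critical hits.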

File Structure

extensions/guardian/
├── index.ts                # Entry — registers before_tool_call hook
├── src/
│   ├── blacklist.ts        # Two-tier regex rules (critical/warning)
│   ├── llm-voter.ts        # LLM intent verification
│   └── audit-log.ts        # SHA-256 hash-chain audit logger
├── test/
│   └── blacklist.test.ts   # Blacklist rule tests
├── openclaw.plugin.json    # Plugin manifest
└── default-policies.json   # Enable/disable toggle

License

MIT

Repository
openclaw/skills
First Seen
Mar 7, 2026