MCPJam CLI Investigation

Use this skill when analyzing MCP server behavior from mcpjam-cli output. The goal is to separate:

- real protocol issues
- interoperability warnings
- implementation polish
- mcpjam or SDK artifacts

Default stance

Treat raw request/response evidence as higher trust than normalized CLI convenience output.
Separate observations, compliance issues, and security findings. They are related, but not interchangeable.
Map claims to spec strength: MUST and MUST NOT are strong conformance signals; SHOULD and RECOMMENDED are softer guidance; MAY and optional fields are usually informational.
Do not label a security finding high unless you can support a concrete attacker benefit or clear breakage path.
When evidence is ambiguous, lower confidence or use pending before overstating the conclusion.
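The spec-strength mapping above can be sketched as a small lookup. This is a minimal illustration, not part of mcpjam: the labels strong, soft, and informational and the helper name spec_strength are hypothetical, and SHOULD NOT and OPTIONAL are added from RFC 2119 as an assumption. Only uppercase keywords count.

```python
import re

# RFC 2119 keywords grouped by how strongly they signal a conformance problem.
# The three labels are illustrative names, not mcpjam output.
STRENGTH_BY_KEYWORD = {
    "MUST NOT": "strong",
    "MUST": "strong",
    "SHOULD NOT": "soft",      # assumption: treated like SHOULD
    "SHOULD": "soft",
    "RECOMMENDED": "soft",
    "MAY": "informational",
    "OPTIONAL": "informational",  # assumption: treated like MAY
}

# Longer phrases come first so "MUST NOT" is not matched as plain "MUST".
_KEYWORD_RE = re.compile(
    r"\b(" + "|".join(sorted(STRENGTH_BY_KEYWORD, key=len, reverse=True)) + r")\b"
)

_RANK = {"informational": 0, "soft": 1, "strong": 2}


def spec_strength(requirement_text: str) -> str:
    """Return the strongest signal found in a quoted spec sentence."""
    found = [STRENGTH_BY_KEYWORD[k] for k in _KEYWORD_RE.findall(requirement_text)]
    if not found:
        # No normative keyword: treat the sentence as informational.
        return "informational"
    return max(found, key=_RANK.__getitem__)
```

Pass the exact spec sentence you are citing, not a paraphrase, so the keyword you rely on is the one the spec actually uses.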

Quick workflow

Start with the narrowest command that actually proves the claim.
If the command may fail, you want a reusable handoff artifact, or CI should retain evidence, add --debug-out <path> to server probe, server validate, tools call, or oauth login.
If the probe shows oauth_required and the task is to inspect the server surface, continue with oauth login or another supported auth flow to obtain reusable credentials before judging post-auth behavior.
After successful auth, inspect the connected surface with direct commands such as server info, server capabilities, tools list, resources list/read/templates, and prompts list/get.
Use server doctor --out <path> when you need one breadth-first snapshot instead of several single-purpose command outputs.
If the output came from server doctor or a --debug-out artifact, split it into primary command evidence, probe evidence, and connected-sweep evidence.
If a field may be CLI-added or SDK-normalized, read references/cli-surface-notes.md before concluding anything.
If the claim depends on MCP semantics, read references/mcp-2025-11-25-interpretation.md.
If the task involves security review, read references/security-best-practices.md for the full checklist and follow the security review workflow below.
Write the result using the output contract below.

Security review workflow

Use this when the task is to assess an MCP server's security posture. All checks use existing CLI commands — no special security tooling is needed. Do not assume every server should require auth.

Phase 1: Observe (read-only)

Run server probe --url <target> --format json first. Add oauth metadata or server doctor --out <path> only when they clarify the picture.

Record an initial auth signal:
- full-auth candidate: probe status is oauth_required
- public-or-mixed candidate: probe status is ready
- unknown: probe is only reachable, error, or otherwise ambiguous
Capture discovery facts:
- OAuth metadata URLs and whether they point to public, private, or suspicious targets
- scopes_supported, WWW-Authenticate, and PKCE methods
- registration strategies such as dcr, cimd, and preregistered
Record the evidence surface you are trusting. Raw probe/RPC evidence beats doctor summaries or convenience fields.
Phase 1 can produce observations and compliance notes. By itself it should not produce a high security severity.
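The initial auth signal above reduces to a three-way mapping. A minimal sketch, assuming you have already extracted the probe status string from the JSON output yourself; the function name initial_auth_signal is hypothetical and the location of the status field in the probe output is not specified here.

```python
def initial_auth_signal(probe_status: str) -> str:
    """Map a probe status string to the provisional Phase 1 auth signal."""
    if probe_status == "oauth_required":
        return "full-auth candidate"
    if probe_status == "ready":
        return "public-or-mixed candidate"
    # reachable, error, and anything unrecognized stay unknown until
    # Phase 2 behavior clarifies the posture.
    return "unknown"
```

The result is provisional by design: it names a candidate posture, not a conclusion.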

Phase 2: Provoke (behavior, still mostly unauth)

Treat the Phase 1 auth signal as provisional until behavior confirms it.

For a full-auth candidate:
- run DCR shape probes if DCR is supported
- spot-check representative unauth tools list or tools call behavior when feasible
- check malformed, expired, or obviously wrong-audience token handling without overstating what a rejection proves
For a public-or-mixed candidate:
- run unauth tools list
- classify exposed tools as read-only, write, or side-effect
- call representative public tools unauth
- check whether gated tools fail with a clean auth challenge instead of silent empty data or partial data
Anonymous tiers, rate limits, or degraded public access are posture notes, not a separate posture class.
Reclassify to one of no-auth, full-auth, mixed-auth, or unknown once Phase 2 behavior is clear. If Phase 2 contradicts Phase 1, update the posture and rerun the relevant checks instead of forcing the old classification.
Input-validation hits from Phase 2 cap at medium security severity until Phase 3 proves attacker benefit.
Design or posture findings can be real security findings in Phase 2, but do not auto-promote them. Document the unsafe behavior, abuse path, and any owner-intent uncertainty before calling them medium or high.
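One way to make the reclassification step explicit is to count what Phase 2 actually showed. This is a rough sketch under stated assumptions: the function name and both counters are hypothetical, and you decide by hand what counts as a successful unauthenticated call versus a clean auth challenge.

```python
def reclassify_posture(unauth_successes: int, auth_challenges: int) -> str:
    """Derive the Phase 2 posture from observed unauthenticated behavior.

    unauth_successes: representative calls that returned real data without auth.
    auth_challenges:  representative calls that failed with a clean auth challenge.
    """
    if unauth_successes < 0 or auth_challenges < 0:
        raise ValueError("counts must be non-negative")
    if unauth_successes == 0 and auth_challenges == 0:
        # Nothing conclusive was observed, so do not guess.
        return "unknown"
    if unauth_successes > 0 and auth_challenges > 0:
        return "mixed-auth"
    if unauth_successes > 0:
        return "no-auth"
    return "full-auth"
```

If the result disagrees with the Phase 1 signal, the behavior wins: update the posture and rerun the checks that depended on the old one.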

Phase 3: Exploit or confirm attacker benefit

Use oauth login and the same browser session when the proof depends on consent or cookies.

Use Phase 3 to turn a plausible concern into a real end-to-end security finding:
- DCR plus authorization flow proof
- redirect URI exact-match bypass proof
- foreign-token acceptance or token passthrough proof
- code, token, or cross-tenant data capture
Consent skip is one route to high, not the only route. Any demonstrated chain that shows concrete attacker gain can justify high.

Phase 4: Inventory blast radius

After auth succeeds, decode JWT claims, inspect Mcp-Session-Id with raw logs, and enumerate tools, resources, prompts, scopes, and tenant context.
Phase 4 is mainly blast-radius calibration. Treat it as context unless you also prove abuse.
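Decoding JWT claims for inventory purposes needs only the standard library. The sketch below reads the payload without verifying the signature, so treat the output as untrusted description, not proof of validity. The function name decode_jwt_claims is hypothetical, and opaque (non-JWT) access tokens will raise ValueError.

```python
import base64
import json


def decode_jwt_claims(token: str) -> dict:
    """Decode the payload segment of a compact JWT. Does NOT verify the signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("not a three-part compact JWT; the token may be opaque")
    payload = parts[1]
    # JWTs strip base64url padding, so restore it before decoding.
    padded = payload + "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))
```

Claims such as aud, scope, and any tenant identifiers are the ones worth recording for blast-radius context.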

Security severity calibration

high: demonstrated attacker benefit or conforming-client breakage with direct evidence
medium: credible security issue with a concrete attack scenario, but end-to-end proof is still partial
low: hardening gap or limited-impact security concern
pending: plausible security concern with a specific missing proof step that could materially raise or lower severity
info: true observation with no credible attacker benefit yet

Use pending instead of manufacturing a medium or high security severity from a checklist hit.
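The calibration can be read as an ordered decision. The sketch below is one interpretation, not an official scoring function: the parameter names are hypothetical, and checking medium before pending is an assumption about precedence. Note that no input lets a bare checklist hit reach medium or high.

```python
def security_severity(
    *,
    demonstrated_impact: bool = False,   # attacker benefit or conforming-client breakage, with direct evidence
    concrete_scenario: bool = False,     # credible issue with a concrete attack scenario, proof still partial
    missing_proof_step: str = "",        # the specific proof step that is still missing
    hardening_gap: bool = False,         # hardening gap or limited-impact concern
) -> str:
    """Return high, medium, pending, low, or info for one finding."""
    if demonstrated_impact:
        return "high"
    if concrete_scenario:
        return "medium"
    if missing_proof_step.strip():
        return "pending"
    if hardening_gap:
        return "low"
    # A true observation with no credible attacker benefit yet.
    return "info"
```

For example, `security_severity(missing_proof_step="show the server accepts a token issued for another resource")` returns pending rather than guessing at medium.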

Command choice

server probe: HTTP transport reachability, initialize behavior, and OAuth discovery hints.
server doctor: combined triage artifact for probe plus connected behavior. Good for breadth, not always sufficient to prove wire-level behavior by itself.
oauth metadata, oauth proxy, oauth debug-proxy: exact endpoint and metadata inspection when conformance output looks surprising.
oauth login: obtain reusable credentials and verify the authenticated MCP path. Use this when the goal is to inspect a server that requires OAuth, then follow it with connected commands rather than stopping at the login result.
oauth conformance, oauth conformance-suite: flow-level auth checks. Treat these as targeted probes, not a complete security review.
server info, server capabilities, server validate, server ping, server export: connected behavior after initialization and auth.
tools list and tools call, resources list/read/templates, prompts list/get/list-multi: direct post-connect capability checks.
Prefer --format json. Add --rpc when available if you need request and response evidence rather than a summary. Add --debug-out when you need a failure-safe artifact, not as a replacement for raw evidence.

Output contract

General triage output

For non-security tasks, return:

Verdict: real issue, interop warning, implementation polish, or scanner/client artifact
Severity: high, medium, low, or info
Confidence: high, medium, or low
Why it matters: one short paragraph tied to interoperability, security, or user impact
Evidence: the exact CLI behavior that supports the claim
Missing evidence: what would need to be confirmed before raising severity or confidence

Security review output

For each claimed security-review finding, return:

Verdict: real issue, interop warning, implementation polish, or scanner/client artifact
Compliance severity: high, medium, low, or info
Security severity: high, medium, low, info, or pending
Confidence: high, medium, or low
Attack scenario or pending rationale: if Security severity is medium or high, open with 2-3 sentences answering who the attacker is, what they need, and what they gain; if it is pending, say exactly what proof is missing
Verified via: the phase plus exact command or result that supports the claim
Evidence: the exact CLI behavior that supports the claim
Missing evidence: what would need to be confirmed before raising severity or confidence
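If you want to check a security-review finding against this contract before reporting it, a small record type is enough. This is an illustrative sketch: the class name, field names, and problems/render helpers are hypothetical, and only the allowed values come from the contract above.

```python
from dataclasses import dataclass

VERDICTS = {"real issue", "interop warning", "implementation polish", "scanner/client artifact"}
COMPLIANCE_SEVERITIES = {"high", "medium", "low", "info"}
SECURITY_SEVERITIES = COMPLIANCE_SEVERITIES | {"pending"}
CONFIDENCES = {"high", "medium", "low"}


@dataclass
class SecurityFinding:
    verdict: str
    compliance_severity: str
    security_severity: str
    confidence: str
    rationale: str          # attack scenario, or pending rationale
    verified_via: str
    evidence: str
    missing_evidence: str

    def problems(self) -> list:
        """Return contract violations; an empty list means the finding is well-formed."""
        issues = []
        if self.verdict not in VERDICTS:
            issues.append("unknown verdict")
        if self.compliance_severity not in COMPLIANCE_SEVERITIES:
            issues.append("unknown compliance severity")
        if self.security_severity not in SECURITY_SEVERITIES:
            issues.append("unknown security severity")
        if self.confidence not in CONFIDENCES:
            issues.append("unknown confidence")
        # medium, high, and pending all require an explicit rationale.
        if self.security_severity in {"medium", "high", "pending"} and not self.rationale.strip():
            issues.append("missing attack scenario or pending rationale")
        return issues

    def render(self) -> str:
        """Format the finding using the labels from the output contract."""
        return "\n".join([
            f"Verdict: {self.verdict}",
            f"Compliance severity: {self.compliance_severity}",
            f"Security severity: {self.security_severity}",
            f"Confidence: {self.confidence}",
            f"Attack scenario or pending rationale: {self.rationale}",
            f"Verified via: {self.verified_via}",
            f"Evidence: {self.evidence}",
            f"Missing evidence: {self.missing_evidence}",
        ])
```

Call problems() before render() and fix anything it reports. The check covers shape only and says nothing about whether the evidence supports the severity.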

Hard rules

Never call toolsMetadata an MCP server field.
Never infer prompt support from an empty prompts list unless you have raw RPC evidence that prompts/list was actually sent and answered by the server.
Never stop at oauth_required when the user asked to inspect the authenticated server surface and the CLI can complete login. Authenticate and continue with post-login commands when feasible.
Never treat missing optional metadata such as outputSchema, content annotations, scopes_supported, or scope hints as a hard failure without a MUST.
Separate OAuth RFC violations from MCP profile preferences.
Distinguish "the server correctly rejected a bad request" from "the overall design is secure."
Treat --debug-out artifacts as aggregated evidence envelopes, not pure wire captures.
Never flag missing scopes_supported or missing scope in WWW-Authenticate as a security issue — both are optional.
Never claim a server is "secure" based solely on it rejecting one specific bad input. A single negative test does not prove broader security posture.
Never let a checklist hit assign high security severity by itself.
JWT aud mismatch is not token passthrough proof unless you show the server accepts a token issued for a different audience or resource, or otherwise misbinds the token.
Support for the plain PKCE method is usually a hardening gap only. It does not compound with DCR flows in which the attacker owns the client, because the attacker already chose the verifier.
Hostile redirect_uri values are not SSRF unless you show the server fetches them.
Public unauthenticated access is not itself a finding. Check whether behavior matches advertised posture and whether exposed surfaces are safe by design.
Anonymous trial or rate-limited access is a posture note, not a separate severity finding.
When compounding findings, explain the compound attack path. Do not just list unrelated findings and call the combination worse.

Reference map

references/cli-surface-notes.md Use for command-specific caveats, artifact shapes, local enrichments, merged errors, and normalized empty arrays.
references/mcp-2025-11-25-interpretation.md Use for capability, lifecycle, transport, authorization, tools, resources, and prompts interpretation against the 2025-11-25 revision of the MCP spec.
references/security-best-practices.md Use for security review checks mapped to CLI commands. Covers SSRF, confused deputy, PKCE, token passthrough, scope minimization, auth-posture checks, and session security. Source: https://modelcontextprotocol.io/docs/tutorials/security/security_best_practices
