Vibetracker Rate

Use this skill to turn a user's real experience with an AI session into a clean Vibetracker submission through vtcli.

Why this matters

Vibetracker only becomes useful if the rating is honest.

  • Honest positive ratings help surface what is working.
  • Honest negative ratings help humans notice regressions and help the ecosystem detect when a model or harness is underperforming.
  • Inflated ratings make the data worse for everyone, including the AI systems that could benefit from better feedback loops.

When the user wants to rate a session, optimize for accuracy, not politeness.

Core rules

  1. Preserve the user's comment exactly as written.
  2. Never append metadata, notes, tags, or formatting to the comment.
  3. Infer optional context only when confidence is high.
  4. If optional context is uncertain, omit it instead of guessing.
  5. If a required value is missing and cannot be inferred reliably, ask one short question.
  6. If vtcli returns a validation error, surface it clearly and correct course instead of retrying random guesses.

Required inputs

Every submission needs:

  • a model slug
  • a score of -1, 0, or 1

Map natural-language ratings like this:

  • good -> 1
  • okay -> 0
  • bad -> -1
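The mapping above can be sketched as a small helper. Only good, okay, and bad come from this skill; the function name and any extra synonyms are illustrative assumptions, not part of vtcli.

```python
# Map a natural-language rating to a Vibetracker score.
# good/okay/bad are from the skill; the extra synonyms are
# illustrative assumptions and may need tuning.
SENTIMENT_TO_SCORE = {
    "good": 1, "great": 1, "positive": 1,
    "okay": 0, "ok": 0, "neutral": 0,
    "bad": -1, "poor": -1, "negative": -1,
}

def to_score(word):
    """Return -1, 0, or 1, or None when the wording is unclear
    and the skill should ask one short question instead."""
    return SENTIMENT_TO_SCORE.get(word.strip().lower())
```

Returning None rather than guessing mirrors rule 5: a missing or unclear required value triggers one short question, not a fabricated score.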

Optional inputs

Include these only when they are strongly supported by the conversation and by the harness context:

  • useCase
  • interface
  • toolId
  • comment

The comment must be the user's exact wording. useCase, interface, and toolId may be inferred from context when the match is clear.

Model handling

Prefer the model identifier already present in the harness, transcript, or prior command output. Pass that value to vtcli rather than inventing a new slug.

Examples of good evidence:

  • the harness prints the full model slug directly
  • the user already supplied a full slug
  • a previous vtcli or API response returned the canonical slug

vtcli resolves canonical full slugs, unambiguous short slugs, and variants that differ only in separator punctuation against Vibetracker's active model catalog. If the harness provides a close model string like gpt-5-4 or claude-sonnet-4-6, pass it as the --model value and let vtcli validate it.

Ask a short question only when the model is missing, the harness exposes multiple plausible models, or vtcli reports that the model is ambiguous or unavailable. When vtcli rejects a model, use vtcli options list --type model --search <query> to inspect candidates before asking the user.
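The rejection flow above can be sketched as a small resolver. The runner is injected so the logic can be shown without a live vtcli install; everything except the `vtcli options list` invocation itself is an illustrative assumption.

```python
import shlex

def resolve_model(query, run):
    """After vtcli rejects a model, list candidates via
    `vtcli options list --type model --search <query>`.
    Return the slug only when exactly one candidate matches;
    otherwise return None so the skill asks the user instead.
    `run` executes a command string and returns stdout lines
    (injected here so the sketch needs no live vtcli).
    """
    cmd = "vtcli options list --type model --search " + shlex.quote(query)
    candidates = [line.strip() for line in run(cmd) if line.strip()]
    return candidates[0] if len(candidates) == 1 else None
```

With two or more plausible candidates the function deliberately refuses to pick, matching the rule that ambiguity earns one short question rather than a guess.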

Submission workflow

  1. Identify the user's intended sentiment.
  2. Convert it to -1, 0, or 1.
  3. Choose the best available model identifier from the harness or user request.
  4. Preserve the comment exactly as written, if provided.
  5. Infer useCase, interface, and toolId only when confidence is high.
  6. Run vtcli opinion add with the smallest correct set of flags.
  7. Show the result clearly.
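The workflow condenses into a command builder that emits only the flags actually known. The flag names match this skill's command patterns; the function itself is an illustrative sketch, and optional context not inferred with high confidence is simply passed as None and dropped.

```python
import shlex

def build_command(model, score, comment=None, use_case=None,
                  interface=None, tool_id=None):
    """Assemble `vtcli opinion add` with the smallest correct
    set of flags. Optional context left as None is omitted."""
    if score not in (-1, 0, 1):
        raise ValueError("score must be -1, 0, or 1")
    parts = ["vtcli", "opinion", "add",
             "--model", model, "--score", str(score)]
    for flag, value in [("--use-case", use_case),
                        ("--interface", interface),
                        ("--tool-id", tool_id),
                        ("--comment", comment)]:
        if value is not None:
            # The comment passes through exactly as written; quoting
            # only protects it for the shell, it never rewrites it.
            parts += [flag, value]
    return " ".join(shlex.quote(p) for p in parts)
```

Building the minimal form by default keeps rule 4 structural: an uncertain optional value never reaches the command line.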

Command patterns

Minimal submission:

vtcli opinion add --model <model-identifier> --score <1|0|-1>

Submission with a comment:

vtcli opinion add --model <model-identifier> --score <1|0|-1> --comment "<exact human comment>"

Submission with confidently inferred context:

vtcli opinion add \
  --model <model-identifier> \
  --score <1|0|-1> \
  --use-case <use-case> \
  --interface <interface> \
  --tool-id <tool-id> \
  --comment "<exact human comment>"

When to ask instead of infer

Ask one short question if any of these are true:

  • the model slug is missing or ambiguous
  • the user seems unsure whether the rating should be positive, neutral, or negative
  • two different interfaces or tools could plausibly apply
  • the user appears to want a custom context value that may not be supported
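The four conditions above collapse into a single predicate; the parameter names are illustrative assumptions about how the harness would surface this state.

```python
def should_ask(model_slug, sentiment_clear,
               interface_candidates, custom_context_requested):
    """True when the skill should ask one short question
    instead of inferring, per the conditions above."""
    return (
        model_slug is None              # model missing or ambiguous
        or not sentiment_clear          # user unsure of the rating
        or len(interface_candidates) > 1  # two plausible interfaces/tools
        or custom_context_requested     # possibly unsupported context value
    )
```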

Setup and auth

If the user has not installed or authenticated vtcli, read references/setup.md before proceeding.

Examples

Example 1

Input: "Rate this Codex session as bad. It kept losing track of the requested file changes. Use the comment exactly like that."

Outcome:

  • score: -1
  • preserve comment exactly
  • infer interface/tool context only if the transcript makes it unambiguous

Example 2

Input: "Please log a Vibetracker rating for this Claude Code run as okay."

Outcome:

  • score: 0
  • no comment unless the user gave one
  • infer Claude Code-related context only if confidence is high

Example 3

Input: "This session was great. Record it for gpt-5.4 with comment Fast, accurate, and followed instructions well."

Outcome:

  • score: 1
  • model: gpt-5.4
  • comment preserved exactly