writing-evals

Pass

Audited by Gen Agent Trust Hub on Mar 8, 2026

Risk Level: SAFE
COMMAND_EXECUTION · EXTERNAL_DOWNLOADS · PROMPT_INJECTION · CREDENTIALS_UNSAFE

Full Analysis
  • [PROMPT_INJECTION]: The skill scaffolds TypeScript files (*.eval.ts) based on natural language descriptions of AI capabilities and steps.
      • Ingestion points: Capability and step names/descriptions provided during the scaffolding process (referenced in SKILL.md).
      • Boundary markers: Templates in reference/templates/ use TODO comments to indicate where user logic should be inserted.
      • Capability inventory: The generated files execute project functions via the task property and are invoked using the npx axiom eval command.
      • Sanitization: No explicit sanitization of user-provided strings is performed during scaffolding in scripts/eval-scaffold.
  • [COMMAND_EXECUTION]: Multiple bash scripts facilitate the initialization, scaffolding, and execution of evaluations.
      • Evidence: scripts/eval-run and scripts/eval-list execute commands through npx axiom eval.
      • Evidence: scripts/eval-scaffold uses sed to perform string replacement in templates based on capability and step arguments.
      • Evidence: scripts/eval-results dynamically locates and executes an axiom-query script from a sibling skill (axiom-sre).
  • [EXTERNAL_DOWNLOADS]: The skill relies on external packages and vendor-controlled endpoints for its core functionality.
      • Evidence: Documentation and scripts (setup, eval-init) reference the axiom package on the NPM registry.
      • Evidence: Configuration templates point to the official Axiom API at https://api.axiom.co.
  • [CREDENTIALS_UNSAFE]: The skill manages authentication secrets for interaction with the Axiom platform.
      • Evidence: README.md and scripts mention the AXIOM_TOKEN environment variable and use placeholders like xaat-your-token.
      • Evidence: reference/templates/instrumentation.ts contains code to attach tokens to authorization headers for telemetry export.
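
The sanitization finding above (capability and step names substituted into templates without validation) could be addressed with an allowlist check before substitution. The following is a minimal sketch of that idea, not the skill's actual code: the function names assertSafeName and fillTemplate, the placeholders __CAPABILITY__ and __STEP__, and the naming rule itself are all assumptions made for illustration.

```typescript
// Hypothetical sketch: validate scaffold arguments before template substitution.
// None of these names come from the audited skill.

// Assumed naming rule: lowercase letter first, then letters, digits, or hyphens.
const SAFE_NAME = /^[a-z][a-z0-9-]{0,63}$/;

// Returns the value unchanged if it matches the allowlist; throws otherwise.
function assertSafeName(kind: string, value: string): string {
  if (!SAFE_NAME.test(value)) {
    throw new Error(`Invalid ${kind} name: ${JSON.stringify(value)}`);
  }
  return value;
}

// Substitutes validated names into a template string.
// __CAPABILITY__ and __STEP__ are assumed placeholder tokens.
function fillTemplate(template: string, capability: string, step: string): string {
  const safeCapability = assertSafeName("capability", capability);
  const safeStep = assertSafeName("step", step);
  // Replacer functions keep "$&"-style sequences from being interpreted.
  return template
    .replace(/__CAPABILITY__/g, () => safeCapability)
    .replace(/__STEP__/g, () => safeStep);
}
```

Under these assumptions, fillTemplate('const name = "__CAPABILITY__-__STEP__";', "support-agent", "classify") yields const name = "support-agent-classify";, while a name containing quotes, semicolons, or slashes is rejected before it can reach the generated file. An equivalent check could be placed in a bash script ahead of the sed call.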
Audit Metadata
Risk Level: SAFE
Analyzed: Mar 8, 2026, 04:32 PM