The Agent Skills Directory

[PROMPT_INJECTION]: The skill is vulnerable to indirect prompt injection due to its handling of external data sources.
Ingestion points: The workflow reads content from EXPERIMENT_LOG.md, EXPERIMENT_TRACKER.md, docs/research_contract.md, and training log files.
Boundary markers: The prompt for the Codex evaluation tool uses simple text headers (e.g., 'Results:', 'Baselines:') that do not securely encapsulate data or prevent embedded instructions from influencing the model.
Capability inventory: The skill is authorized to use Bash(*), Write, and Edit tools, which could be exploited to run malicious commands or modify the project filesystem if the LLM is compromised via injected instructions.
Sanitization: There is no evidence of sanitization, validation, or escaping of the ingested results before they are processed by the evaluation tool.
[COMMAND_EXECUTION]: The skill performs shell command execution to retrieve data and update local project state.
Suggests the use of ssh to retrieve logs from remote servers.
Executes the local script tools/research_wiki.py with arguments derived from the analysis results to update the project wiki.

result-to-claim