sf-ai-agentforce-testing

Warn

Audited by Gen Agent Trust Hub on Mar 14, 2026

Risk Level: MEDIUM (CREDENTIALS_UNSAFE, COMMAND_EXECUTION, PROMPT_INJECTION)
Full Analysis
  • [CREDENTIALS_UNSAFE]: The skill manages sensitive External Client App (ECA) credentials (consumer keys and secrets) by storing them in a hidden directory in the user's home folder (~/.sfagent/). While the credential_manager.py script applies restricted file permissions (0600), this mechanism involves persistent plain-text storage of secrets used for API authentication.
  • [COMMAND_EXECUTION]: The skill uses Python's subprocess module across several scripts (e.g., run-automated-tests.py, multi_turn_fix_loop.py) to execute the Salesforce CLI (sf) and internal Python helper scripts. These commands automate metadata discovery, deploy test definitions, and retrieve results.
  • [PROMPT_INJECTION]: The skill contains asset files (e.g., guardrail-tests.yaml, cli-auth-guardrail-tests.yaml) that include common prompt injection payloads, such as instructions to 'Ignore all previous instructions' or 'Reveal all data'. In the context of this testing skill, these strings are used as test data to evaluate the robustness of the target agent's guardrails.
  • [PROMPT_INJECTION]: The skill exposes a surface for indirect prompt injection because it ingests and processes potentially untrusted metadata from local .agent files and remote Salesforce API responses.
    • Ingestion points: agent_discovery.py parses local DSL and XML metadata files; agent_api_client.py processes responses from the Agent Runtime API.
    • Boundary markers: Present in the YAML test templates but not explicitly enforced within the internal string processing logic of the discovery scripts.
    • Capability inventory: The skill can read and write files in ~/.sfagent/, perform network requests via urllib, and execute system commands via subprocess.
    • Sanitization: The scripts rely on standard JSON and XML parsers but do not explicitly sanitize metadata content before it is displayed in reports or used in internal logic.
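
To illustrate the CREDENTIALS_UNSAFE finding, the following is a minimal sketch of owner-only (0600) plain-text credential storage. It is not the code in credential_manager.py; the function names, the credentials.json file name, and the JSON layout are assumptions made for illustration. It shows why the finding still applies: the file mode limits who can read the file, but the secret itself is stored unencrypted.

```python
import json
import os
import stat
from pathlib import Path


def save_credentials(path: Path, consumer_key: str, consumer_secret: str) -> None:
    """Write credentials as JSON, readable and writable by the owner only.

    Hypothetical helper: the name and JSON layout are illustrative assumptions.
    """
    path.parent.mkdir(mode=0o700, parents=True, exist_ok=True)
    # Passing 0o600 to os.open applies the mode when the file is created,
    # so a new file is never briefly readable by other users.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(
            {"consumer_key": consumer_key, "consumer_secret": consumer_secret},
            handle,
        )
    # Tighten the mode in case the file already existed with looser permissions.
    os.chmod(path, 0o600)


def load_credentials(path: Path) -> dict:
    """Read the credentials back; the content on disk is plain text."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def file_mode(path: Path) -> int:
    """Return only the permission bits of the file, e.g. 0o600."""
    return stat.S_IMODE(os.stat(path).st_mode)
```

On a POSIX system, calling save_credentials on a path such as ~/.sfagent/credentials.json (an assumed file name) and then file_mode on the same path should report 0o600. Any process running as the same user can still read the secret, which is why an OS keychain or secret manager is usually preferred for long-lived secrets.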
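
The boundary-marker and sanitization gaps noted above could be addressed along the following lines. This is a sketch under stated assumptions, not part of the audited skill: the marker strings, the function names, and the 2000-character limit are all hypothetical. It strips control characters, truncates long values, escapes angle brackets, and wraps the result in explicit markers so that downstream logic can tell untrusted metadata apart from instructions.

```python
import html
import re

# Hypothetical marker strings; the audited skill does not define these.
BEGIN_MARKER = "<<<UNTRUSTED_METADATA>>>"
END_MARKER = "<<<END_UNTRUSTED_METADATA>>>"

# ASCII control characters except tab, newline, and carriage return.
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")


def sanitize_metadata(value: str, max_length: int = 2000) -> str:
    """Return a copy of value that is safer to embed in a report."""
    cleaned = _CONTROL_CHARS.sub("", value)
    if len(cleaned) > max_length:
        cleaned = cleaned[:max_length] + "...[truncated]"
    # Escaping "<" and ">" also means the content cannot reproduce the
    # markers above, so it cannot close the boundary early.
    return html.escape(cleaned)


def wrap_untrusted(value: str) -> str:
    """Sanitize value and place it between explicit boundary markers."""
    return f"{BEGIN_MARKER}\n{sanitize_metadata(value)}\n{END_MARKER}"
```

For example, wrap_untrusted could be applied to a description field read from a .agent file before that text is written into a report. Escaping and wrapping reduce the risk but do not remove it: any component that later interprets the wrapped text still has to treat it as data.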
Audit Metadata
Risk Level: MEDIUM
Analyzed: Mar 14, 2026, 03:26 PM