azure-ai-contentsafety-java
Pass
Audited by Gen Agent Trust Hub on Feb 13, 2026
Risk Level: LOW
Full Analysis
The skill files (SKILL.md, references/acceptance-criteria.md, references/examples.md) were thoroughly analyzed for security vulnerabilities across all 9 threat categories. No critical, high, or medium severity findings were identified.
- Prompt Injection: No patterns indicative of prompt injection attempts (e.g., 'IMPORTANT: Ignore', 'Override', 'jailbroken') were found in any of the files.
- Data Exfiltration: The skill correctly retrieves API keys and endpoint URLs from environment variables (System.getenv or Configuration.getGlobalConfiguration().get), which is a secure practice. It reads local image files for analysis but does not exfiltrate them to untrusted external destinations. Network communication is exclusively with the Azure AI Content Safety service, which is the intended and legitimate purpose of the skill. No attempts to access or exfiltrate sensitive local files (e.g., ~/.aws/credentials, ~/.ssh/id_rsa) were detected.
- Obfuscation: No obfuscation techniques such as Base64 encoding, zero-width characters, homoglyphs, or URL/hex/HTML encoding were found.
- Unverifiable Dependencies: The skill relies on the com.azure:azure-ai-contentsafety and com.azure:azure-identity Maven dependencies. These are official Azure SDK components, maintained by Microsoft, a trusted organization. The acceptance-criteria.md file also explicitly references the official Azure SDK for Java GitHub repository (https://github.com/Azure/azure-sdk-for-java), which is a trusted source. This is noted as an informational finding (LOW severity) due to external dependency, but does not elevate the overall risk given the trusted nature of the source.
- Privilege Escalation: No commands or instructions attempting to gain elevated privileges (e.g., sudo, chmod 777, service installations) were found.
- Persistence Mechanisms: No attempts to establish persistence (e.g., modifying .bashrc, creating cron jobs, altering authorized_keys) were detected.
- Metadata Poisoning: The skill's metadata (name, description) and comments were free of malicious instructions or hidden commands.
- Indirect Prompt Injection: As a content moderation skill, it processes user-provided text and images. This inherently makes it susceptible to indirect prompt injection if malicious instructions are embedded within the content it analyzes. This is a general risk for such skills and not a direct vulnerability introduced by the skill's code itself. This is noted as an informational finding.
- Time-Delayed / Conditional Attacks: No conditional logic based on dates, times, usage counts, or specific environment triggers designed to activate malicious behavior was found.
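The environment-variable practice noted under Data Exfiltration can be sketched as follows. This is a minimal illustration, not the skill's actual code: the class ContentSafetySettings, its methods, and the variable names CONTENT_SAFETY_ENDPOINT and CONTENT_SAFETY_KEY are assumptions made for the example, and it uses only the JDK so that it stays self-contained.

```java
import java.util.Map;

// Hypothetical helper: the class, method, and variable names are illustrative
// assumptions and are not taken from the audited skill.
final class ContentSafetySettings {
    // Assumed environment variable names; the audit does not state which names the skill reads.
    static final String ENDPOINT_VAR = "CONTENT_SAFETY_ENDPOINT";
    static final String KEY_VAR = "CONTENT_SAFETY_KEY";

    final String endpoint;
    final String key;

    private ContentSafetySettings(String endpoint, String key) {
        this.endpoint = endpoint;
        this.key = key;
    }

    // Reads both values from the process environment instead of hardcoding them.
    static ContentSafetySettings fromEnvironment() {
        return from(System.getenv());
    }

    // Accepts any map so the lookup can be exercised without touching the real environment.
    static ContentSafetySettings from(Map<String, String> env) {
        return new ContentSafetySettings(require(env, ENDPOINT_VAR), require(env, KEY_VAR));
    }

    // Fails fast with the variable name only; the value is never echoed.
    private static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.isBlank()) {
            throw new IllegalStateException("Missing environment variable: " + name);
        }
        return value;
    }

    // Redacts the key so it cannot leak through logging.
    @Override
    public String toString() {
        return "ContentSafetySettings{endpoint=" + endpoint + ", key=<redacted>}";
    }
}
```

Calling ContentSafetySettings.fromEnvironment() at startup would surface a missing variable immediately, and the redacted toString keeps the key out of log output.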
Overall, the skill demonstrates good security hygiene and does not pose any direct security risks.
Audit Metadata