03-deduplication
Pass
Audited by Gen Agent Trust Hub on Mar 8, 2026
Risk Level: SAFE
Full Analysis
- [SAFE]: No malicious patterns, obfuscation, or unauthorized data access techniques were found in the provided files. The skill adheres to legitimate data engineering practices for Delta Lake environments.
- [COMMAND_EXECUTION]: The script scripts/check_duplicates.py utilizes the PySpark API to validate and count duplicates in Delta tables. This functionality is consistent with its stated purpose of data quality management and does not involve arbitrary system command execution.
- [PROMPT_INJECTION]: The skill's instructions and metadata were analyzed for bypass markers and override attempts; no such patterns were detected. Regarding indirect injection risks, the skill ingests data from Spark tables via spark.table(). While it lacks explicit boundary markers or sanitization for this external data, the capability inventory is limited to structured data operations (deduplication and merging) and lacks dangerous primitives like dynamic code execution or network exfiltration.
- [DATA_EXFILTRATION]: Data operations are performed within the Spark environment using standard table access methods. No evidence of hardcoded credentials, sensitive file path access, or unauthorized data transfer to external domains was found.
Audit Metadata