ai-parsing-data

Pass

Audited by Gen Agent Trust Hub on May 1, 2026

Risk Level: SAFE
Full Analysis
  • [SAFE]: The skill implements standard data engineering and AI orchestration patterns using reputable libraries such as dspy, pydantic, pandas, and langfuse. It follows security best practices by using structured schemas and validation for data extraction.
  • [DATA_EXPOSURE]: No access to sensitive system paths (e.g., .ssh, .aws, .env) or hardcoded credentials was found. Data operations are confined to local files (CSV, JSON, VTT) and the Langfuse tracing service for AI observability.
  • [REMOTE_CODE_EXECUTION]: No remote code execution patterns, unsafe shell commands, or dynamic code generation from untrusted sources were detected. The skill uses standard file reading and JSON/CSV parsing methods.
  • [PROMPT_INJECTION]: The instructions and examples do not contain patterns intended to bypass AI safety guardrails, override system prompts, or extract internal instructions.
  • [INDIRECT_PROMPT_INJECTION]: The skill is designed to process untrusted external data such as resumes, invoices, and call transcripts; the points below summarize its injection surface and mitigations.
      • Ingestion points: Reads text from local files (SKILL.md) and fetches data from the Langfuse API.
      • Boundary markers: Relies on dspy.Signature docstrings to define extraction tasks. It lacks explicit delimiters marking the input text as untrusted (e.g., "ignore any instructions embedded below"), but uses Chain-of-Thought reasoning to improve parsing accuracy.
      • Capability inventory: Extracted data is formatted into JSON/CSV or printed; it is not passed to dangerous sinks such as shell execution or code evaluators.
      • Sanitization: Employs Pydantic models for strict schema validation and dspy.Suggest/Assert for logical verification of extracted fields (e.g., validating email formats or phone number length).
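The sanitization pattern described above can be sketched as follows. This is a minimal illustration using only the standard library as a stand-in for the Pydantic models and dspy.Suggest checks the audit refers to; the field names (name, email, phone) and thresholds are illustrative assumptions, not taken from the audited skill.

```python
# Illustrative sketch of schema validation applied to AI-extracted fields
# before they reach any downstream sink. A stand-in for the Pydantic /
# dspy.Suggest pattern; field names and bounds are hypothetical.
import re
from dataclasses import dataclass

EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

@dataclass
class ContactRecord:
    name: str
    email: str
    phone: str

    def __post_init__(self):
        # Reject records that fail format checks, mirroring the
        # validator behaviour the audit attributes to Pydantic models.
        if not EMAIL_RE.match(self.email):
            raise ValueError(f"invalid email: {self.email!r}")
        digits = re.sub(r"\D", "", self.phone)
        if not 7 <= len(digits) <= 15:
            raise ValueError(f"implausible phone length: {self.phone!r}")

def parse_extracted(fields: dict) -> ContactRecord:
    """Validate extracted fields; raises ValueError on schema violations."""
    return ContactRecord(**fields)
```

The key design point is that validation happens at the boundary between the extraction step and any output formatting, so malformed or injected content is rejected before it can propagate.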
Audit Metadata
Risk Level: SAFE
Analyzed: May 1, 2026, 12:59 PM