ai-parsing-data
Pass
Audited by Gen Agent Trust Hub on May 1, 2026
Risk Level: SAFE
Full Analysis
- [SAFE]: The skill implements standard data engineering and AI orchestration patterns using reputable libraries such as dspy, pydantic, pandas, and langfuse. It follows security best practices by using structured schemas and validation for data extraction.
- [DATA_EXPOSURE]: No access to sensitive system paths (e.g., .ssh, .aws, .env) or hardcoded credentials was found. Data operations are confined to local files (CSV, JSON, VTT) and the Langfuse tracing service for AI observability.
- [REMOTE_CODE_EXECUTION]: No remote code execution patterns, unsafe shell commands, or dynamic code generation from untrusted sources were detected. The skill uses standard file reading and JSON/CSV parsing methods.
- [PROMPT_INJECTION]: The instructions and examples do not contain patterns intended to bypass AI safety guardrails, override system prompts, or extract internal instructions.
- [INDIRECT_PROMPT_INJECTION]: The skill is designed to process untrusted external data such as resumes, invoices, and call transcripts.
  - Ingestion points: Reads text from local files (SKILL.md) and fetches data from the Langfuse API.
  - Boundary markers: Relies on dspy.Signature docstrings to define extraction tasks. While it lacks explicit 'ignore instructions' delimiters for the input text, it uses Chain-of-Thought reasoning to improve parsing accuracy.
  - Capability inventory: Extracted data is formatted into JSON/CSV or printed; it is not passed to dangerous sinks like shell execution or code evaluators.
  - Sanitization: Employs Pydantic models for strict schema validation and dspy.Suggest/Assert for logical verification of extracted fields (e.g., validating email formats or phone number length).
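
The boundary-marker and sanitization findings above can be illustrated with a minimal sketch. It is not the skill's actual code: the audit says the skill uses Pydantic models and dspy.Suggest/Assert, whereas this example swaps in a standard-library dataclass so it runs without dependencies. The marker strings, the `wrap_untrusted` helper, the `Contact` class, the email pattern, and the 7 to 15 digit phone bound are all hypothetical choices made for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical delimiters; the audit notes the skill does not define any.
BEGIN_MARK = "<<<UNTRUSTED_INPUT>>>"
END_MARK = "<<<END_UNTRUSTED_INPUT>>>"

# Deliberately loose email pattern, assumed for illustration only.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def wrap_untrusted(text: str) -> str:
    """Fence untrusted document text so a prompt can refer to it as data only."""
    cleaned = text
    # Remove marker look-alikes until none remain, so the input
    # cannot open or close the fence itself.
    while BEGIN_MARK in cleaned or END_MARK in cleaned:
        cleaned = cleaned.replace(BEGIN_MARK, "").replace(END_MARK, "")
    return f"{BEGIN_MARK}\n{cleaned}\n{END_MARK}"


@dataclass
class Contact:
    """Stdlib stand-in for a Pydantic model: checks run on construction."""

    email: str
    phone: str

    def __post_init__(self) -> None:
        if not EMAIL_RE.match(self.email):
            raise ValueError(f"invalid email: {self.email!r}")
        # Count digits only, ignoring spaces, dashes, and parentheses.
        digits = re.sub(r"\D", "", self.phone)
        if not 7 <= len(digits) <= 15:
            raise ValueError(f"invalid phone length: {self.phone!r}")
```

Under these assumptions, `wrap_untrusted(resume_text)` would be applied before the text reaches the model, and `Contact(email=..., phone=...)` would be applied to the extracted fields afterwards, raising `ValueError` on malformed values instead of passing them downstream.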
Audit Metadata