pdf-parser
Fail
Audited by Gen Agent Trust Hub on Mar 10, 2026
Risk Level: CRITICALREMOTE_CODE_EXECUTIONCOMMAND_EXECUTIONDATA_EXFILTRATIONPROMPT_INJECTION
Full Analysis
- [REMOTE_CODE_EXECUTION]: The
package.jsonfile specifies a dependency onpdf-parsewith version^2.4.5. According to the official NPM registry, thepdf-parsepackage has not been updated in years and its latest stable version is1.1.1. Specifying a significantly higher, non-existent version is a definitive indicator of a dependency confusion attack, where a malicious package with the higher version number is published to a public registry to override legitimate internal or older packages. - [COMMAND_EXECUTION]: The skill instructions in
SKILL.mddirect the agent to executenpm install --silentwithin the scripts directory. This command will trigger the download and installation of the suspiciouspdf-parse@2.4.5package, potentially executing malicious pre-install or post-install scripts on the host system. - [DATA_EXFILTRATION]: The JavaScript code in
extract_pdf_text.jsutilizes an API (new PDFParse) that does not exist in the legitimatepdf-parselibrary. This confirms the script is designed to interact with a compromised or malicious version of the library. Given the script has direct access to the file system (fs.readFileSync), a malicious dependency could easily exfiltrate the contents of processed PDFs or other sensitive files to an external server. - [PROMPT_INJECTION]: The skill exhibits a high surface area for indirect prompt injection as it processes untrusted external data (PDFs) and extracts text for agent use.
- Ingestion points: The
extract_pdf_text.jsscript reads binary data from an arbitrary file path provided as{입력.pdf}. - Boundary markers: The skill lacks any delimiters or instructions to the agent to ignore potentially malicious commands embedded within the extracted PDF text.
- Capability inventory: The skill has the ability to read from and write to any path on the file system via the
fsmodule. - Sanitization: There is no validation or sanitization of the extracted text content before it is written to the output file or returned to the agent context.
Recommendations
- AI detected serious security threats
Audit Metadata