vector-db-ingest
Pass
Audited by Gen Agent Trust Hub on Mar 12, 2026
Risk Level: SAFEEXTERNAL_DOWNLOADSREMOTE_CODE_EXECUTIONCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
- [EXTERNAL_DOWNLOADS]: The
init.pyscript installs Python dependencies from PyPI, and theoperations.pymodule downloads thenomic-ai/nomic-embed-text-v1.5model from Hugging Face. These operations are directed at well-known services and are standard for the skill's functionality.\n- [REMOTE_CODE_EXECUTION]: Inoperations.py, theHuggingFaceEmbeddingsclass is initialized withtrust_remote_code=True. While this enables the execution of code from the remote model repository, the risk is mitigated by the use of a reputable model from a trusted vendor.\n- [COMMAND_EXECUTION]: Theinit.pyscript utilizessubprocess.check_callto install required packages viapip. This is a standard initialization procedure for Python-based agent skills.\n- [PROMPT_INJECTION]: The ingestion pipeline processes repository files and stores them in the vector database without sanitization or filtering. This creates a surface for indirect prompt injection, where malicious instructions embedded in a project file could be retrieved and executed by an agent during a RAG-based query.\n - Ingestion points:
scripts/ingest.pyreads files matching the project manifest configuration.\n - Boundary markers: The system does not currently implement boundary markers or instructions to ignore embedded commands within the indexed content.\n
- Capability inventory: Initialization commands (
init.py), file system access, and network operations for database connectivity.\n - Sanitization: Content is indexed as raw text or parsed via standard language shims without safety validation.\n- [SAFE]: The
ingest_code_shim.pyscript uses the standardxml.etree.ElementTreelibrary for parsing XML files. As a best practice to prevent potential XML External Entity (XXE) attacks from malicious files, the use of a hardened library such asdefusedxmlis recommended.
Audit Metadata