[EXTERNAL_DOWNLOADS]: The skill instructs the agent to install the geniml package from PyPI and from the databio/geniml GitHub repository. It also requires installing the StarSpace tool from github.com/facebookresearch/StarSpace. The Facebook repository is an official research source from a well-known entity, whereas the databio repository is a third-party source.
[REMOTE_CODE_EXECUTION]: The skill uses ScEmbed.from_pretrained to download and load machine learning models from Hugging Face (e.g., databio/scembed-pbmc-10k). Although Hugging Face is a well-known service, loading serialized models can, depending on the serialization format, execute code embedded in the downloaded files.
[COMMAND_EXECUTION]: The skill frequently runs CLI tools and shell commands (such as geniml, uniwig, and cat) as subprocesses to process genomic data and manage the analysis workflow.
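One way to narrow this surface is sketched below, as an illustration rather than anything the skill itself does. It assumes an allowlist built from the three tool names mentioned above and passes arguments as a list so that no shell interprets them. The names ALLOWED_TOOLS, build_command, and run_tool are hypothetical, and no tool-specific flags are assumed.

```python
import subprocess

# Hypothetical allowlist, taken from the tools named in the finding above.
ALLOWED_TOOLS = {"geniml", "uniwig", "cat"}


def build_command(tool: str, args: list[str]) -> list[str]:
    """Return an argv list for an allowlisted tool.

    Arguments stay as separate list items, so shell metacharacters in a
    file name are passed through literally instead of being interpreted.
    """
    if tool not in ALLOWED_TOOLS:
        raise ValueError(f"tool not allowed: {tool!r}")
    for arg in args:
        if "\x00" in arg:
            raise ValueError("NUL byte in argument")
    return [tool, *args]


def run_tool(tool: str, args: list[str], timeout: float = 600.0):
    """Run the tool without a shell and fail loudly on a non-zero exit."""
    return subprocess.run(
        build_command(tool, args),
        check=True,           # raise CalledProcessError on failure
        capture_output=True,  # keep stdout/stderr for inspection
        text=True,
        timeout=timeout,
    )
```

Because shell=True is never used, a file name such as "a.bed; rm -rf /" reaches the tool as a single literal argument.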
[DATA_EXFILTRATION]: The BBClient utility downloads genomic files from remote BEDbase repositories over the network and stores them in a local cache directory (~/.bedcache).
[PROMPT_INJECTION]: The skill processes untrusted external data (BED and CSV files), which constitutes an indirect prompt injection surface. While such inputs are standard in bioinformatics, they could be crafted to influence agent behavior through model responses.
Ingestion points: Genomic interval data (BED files), AnnData objects, and metadata CSV files containing experimental labels.
Boundary markers: None identified in the provided data processing snippets.
Capability inventory: File system writes (output directories), subprocess execution (CLI tools), and network connectivity (data and model downloads).
Sanitization: Standard biological data parsing is performed, but no explicit sanitization for adversarial prompt content was observed.
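As a rough sketch of what explicit sanitization could look like for BED input, the example below keeps only the first three columns (chromosome, start, end) and rejects any line whose fields do not match a strict pattern. It is a hypothetical illustration, not part of geniml: the names CHROM_RE, parse_bed_line, and load_intervals, the 64-character limit, and the decision to drop all optional columns are assumptions.

```python
import re

# Assumed constraint: chromosome names are short ASCII identifiers.
CHROM_RE = re.compile(r"[A-Za-z0-9_.\-]{1,64}")
COORD_RE = re.compile(r"[0-9]+")


def parse_bed_line(line: str):
    """Return (chrom, start, end) for a well-formed BED line, else None."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) < 3:
        return None
    chrom, start, end = fields[0], fields[1], fields[2]
    if not CHROM_RE.fullmatch(chrom):
        return None
    if not (COORD_RE.fullmatch(start) and COORD_RE.fullmatch(end)):
        return None
    start_i, end_i = int(start), int(end)
    if end_i < start_i:
        return None
    # Optional columns (name, score, ...) are dropped on purpose: they are
    # free text and the most likely place for injected instructions.
    return chrom, start_i, end_i


def load_intervals(lines):
    """Parse an iterable of lines; return (kept_intervals, rejected_count)."""
    kept, rejected = [], 0
    for line in lines:
        # Skip blank lines and standard header/comment lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        parsed = parse_bed_line(line)
        if parsed is None:
            rejected += 1
        else:
            kept.append(parsed)
    return kept, rejected
```

Only typed coordinates and a constrained chromosome name leave the parser, so free-text content in a crafted file never reaches the agent's context through this path. Metadata CSV labels would need a comparable check that this sketch does not cover.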

geniml