agently-embeddings
Agently Embeddings
This skill covers embedding requests in Agently. It focuses on OpenAICompatible embeddings setup, request shape, input batching, async usage, parsed vector results, and the embedding-agent handoff used by vector-store integrations. It does not cover general chat/completions setup, structured output control, prompt-template management, or full retrieval pipeline design.
Prerequisite: Agently >= 4.0.8.5.
Agently is async-first at the runtime layer. Prefer async_start() or async_get_data() when the caller can use async APIs. Use batching first for texts that belong to one embeddings job, then use async concurrency for overlapping embedding jobs.
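The batch-first, then-concurrency order can be sketched as below. This is a minimal illustration, not Agently's own scheduling: embed_batch is a hypothetical async callable standing in for one embeddings request (for example, a thin wrapper around agent.input(batch).async_start()), and batch_size and max_concurrency are illustrative parameters rather than Agently settings.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence

Vector = List[float]
# Hypothetical signature: one call corresponds to one batch embeddings request.
EmbedBatch = Callable[[List[str]], Awaitable[List[Vector]]]


def chunk(texts: Sequence[str], batch_size: int) -> List[List[str]]:
    """Split texts into consecutive batches of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    return [list(texts[i:i + batch_size]) for i in range(0, len(texts), batch_size)]


async def embed_all(
    texts: Sequence[str],
    embed_batch: EmbedBatch,
    batch_size: int = 64,
    max_concurrency: int = 4,
) -> List[Vector]:
    """Batch first, then overlap the batches with bounded async concurrency."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run(batch: List[str]) -> List[Vector]:
        async with semaphore:
            return await embed_batch(batch)

    results = await asyncio.gather(*(run(b) for b in chunk(texts, batch_size)))
    # gather() preserves submission order, so vectors line up with texts.
    return [vector for batch_vectors in results for vector in batch_vectors]


async def fake_embed_batch(batch: List[str]) -> List[Vector]:
    """Stand-in embedder for local experiments: one 2-d vector per text."""
    await asyncio.sleep(0)
    return [[float(len(text)), 1.0] for text in batch]
```

Calling asyncio.run(embed_all(texts, fake_embed_batch, batch_size=2)) exercises the flow without a model endpoint; in real use the stand-in would be replaced by a call into an embedding agent.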
Scope
Use this skill for:
- configuring OpenAICompatible with model_type="embeddings"
- choosing between base_url, full_url, auth, proxy, timeout, client_options, and request_options for embeddings
- understanding that embeddings requests are built from input, not from chat-style prompt assembly
- sending one text or a batch of texts through input(...)
- understanding how non-scalar input is serialized before it is sent
- consuming parsed embedding vectors through start(), get_data(), async_start(), or async_get_data()
- understanding the parsed result shape for single-input and batch-input requests
- using an embedding agent as the handoff point for vector-store integrations such as Chroma
- organizing offline indexing or backfill jobs
- organizing low-latency online query embedding
Do not use this skill for:
- Chat LLM, Completions LLM, or VLM setup
- .output(...), ensure_keys, or structured-output retries
- prompt-slot composition beyond the embeddings input
- retrieval ranking, knowledge-base orchestration, or answer-generation logic
- full Chroma or RAG workflow design
Minimal Request Boundary
For embeddings, input(...) is the real payload.
info(...), instruct(...), examples(...), output(...), and attachment(...) are not the request body for the embeddings model type.
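A minimal sketch of that boundary, assuming the agent exposes the chainable input(...) and start() calls named in this skill. StubEmbeddingAgent is a hypothetical stand-in so the snippet runs without a model endpoint; it is not part of Agently, and it assumes that a single text still yields a list containing one vector.

```python
from typing import List, Union

Vector = List[float]
Payload = Union[str, List[str]]


class StubEmbeddingAgent:
    """Hypothetical stand-in for an agent configured with model_type="embeddings"."""

    def __init__(self) -> None:
        self._payload: Payload = []

    def input(self, payload: Payload) -> "StubEmbeddingAgent":
        # Only input(...) contributes to the embeddings request body.
        self._payload = payload
        return self

    def start(self) -> List[Vector]:
        texts = [self._payload] if isinstance(self._payload, str) else list(self._payload)
        # Fake vectors: [character count, word count] for each text.
        return [[float(len(t)), float(len(t.split()))] for t in texts]


def embed(agent, payload: Payload) -> List[Vector]:
    """Send one text or a list of texts; no other prompt slot is set."""
    return agent.input(payload).start()


single = embed(StubEmbeddingAgent(), "hello world")
batch = embed(StubEmbeddingAgent(), ["hello world", "hi"])
```

The embed helper never touches info(...), instruct(...), or output(...), which mirrors the rule that those slots are not part of an embeddings request.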
Workflow
- If the task is about model type, endpoint shape, or which settings matter for embeddings, read references/config-and-request-shape.md.
- If the task is about single input, batch input, async usage, or throughput guidance, read references/input-batching-and-async.md.
- If the task is about what the returned data looks like or which getter to use, read references/result-shape-and-consumption.md.
- If the task is about offline indexing, backfill jobs, or online query embedding, read references/production-scenarios.md.
- If the task is about passing an embedding agent into a vector-store integration, read references/vector-store-handoff.md.
- If behavior still looks wrong, use references/troubleshooting.md.
Core Mental Model
Agently embeddings are simpler than chat or completions requests:
- OpenAICompatible must use model_type="embeddings".
- The request body is built from input(...).
- A single input becomes one embeddings request; a list input becomes one batch embeddings request.
- Parsed result data is a list of embedding vectors.
- Embeddings are not a streaming-response workflow. Treat the stream parameter as irrelevant for this model type.
- The same embedding agent can then be reused as the embedding function for vector-store integrations.
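Because the parsed result is a list of vectors, callers usually need to line those vectors up with the texts they sent. The helper below is a hedged sketch of that step: pair_with_vectors is a hypothetical name, and it assumes one vector per input text, returned in input order, for both single and batch requests.

```python
from typing import List, Sequence, Tuple, Union

Vector = List[float]


def pair_with_vectors(
    texts: Union[str, Sequence[str]],
    vectors: Sequence[Vector],
) -> List[Tuple[str, Vector]]:
    """Zip input texts with parsed vectors after checking the result shape."""
    items = [texts] if isinstance(texts, str) else list(texts)
    if len(items) != len(vectors):
        raise ValueError(f"expected {len(items)} vectors, got {len(vectors)}")
    dimensions = {len(vector) for vector in vectors}
    if len(dimensions) > 1:
        raise ValueError(f"inconsistent vector dimensions: {sorted(dimensions)}")
    return list(zip(items, vectors))


# Example with hand-written vectors in place of a real embeddings result.
pairs = pair_with_vectors(["alpha", "beta"], [[0.1, 0.2], [0.3, 0.4]])
```

Checking the count and dimensions up front tends to surface mismatches early, before the vectors reach a vector store.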
Selection Rules
- embedding endpoint setup or request-body shape -> config-and-request-shape.md
- one text or one list of texts -> input(...)
- many texts that belong to the same embedding job -> batch them in one request first
- async service or overlapping embedding jobs -> prefer async_start() / async_get_data()
- normal embedding result consumption -> start() or get_data()
- meta or original payload inspection -> get_response() first, then read from response.result
- embeddings always return one completed result rather than delta / instant streaming events
- offline indexing or backfill -> batch within one job first, then overlap jobs with async concurrency
- online query embedding -> prefer one short request per user query and keep the path latency-oriented
- vector-store handoff -> vector-store-handoff.md
- do not treat .output(...) or get_data_object() as the normal embeddings path
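For the online query rule, a latency-oriented path might look like the sketch below. It assumes the agent exposes input(...) and async_start() as named in this skill; StubAsyncEmbeddingAgent, embed_query, and timeout_s are hypothetical names used only for illustration, and the timeout is applied with asyncio.wait_for rather than any Agently setting.

```python
import asyncio
from typing import List

Vector = List[float]


class StubAsyncEmbeddingAgent:
    """Hypothetical stand-in exposing input(...) and async_start()."""

    def __init__(self, delay: float = 0.0) -> None:
        self._delay = delay
        self._text = ""

    def input(self, text: str) -> "StubAsyncEmbeddingAgent":
        self._text = text
        return self

    async def async_start(self) -> List[Vector]:
        # Simulate network latency, then return one fake 1-d vector.
        await asyncio.sleep(self._delay)
        return [[float(len(self._text))]]


async def embed_query(agent, query: str, timeout_s: float = 2.0) -> Vector:
    """Embed one user query and fail fast if the request is slow."""
    vectors = await asyncio.wait_for(
        agent.input(query).async_start(), timeout=timeout_s
    )
    # Assumes a single text yields a list with exactly one vector.
    return vectors[0]
```

One short request per query keeps the path simple; batching is left to offline indexing and backfill jobs, where throughput matters more than per-request latency.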
References
- references/source-map.md
- references/config-and-request-shape.md
- references/input-batching-and-async.md
- references/production-scenarios.md
- references/result-shape-and-consumption.md
- references/vector-store-handoff.md
- references/troubleshooting.md