skills/vincentkoc/dotskills/opik-optimizer

Opik Optimizer

Purpose

Design, run, and interpret Opik Optimizer workflows for prompts, tools, and model parameters with consistent dataset/metric wiring and reproducible evaluation.

When to use

Use this skill when a user asks for:

  • Choosing and configuring Opik Optimizer algorithms for prompt/agent optimization.
  • Writing ChatPrompt-based optimization runs and custom metric functions.
  • Optimizing with tools (function calling or MCP), selected prompt roles, or prompt segments.
  • Tuning LLM call parameters with optimize_parameter.
  • Comparing optimizer outputs and interpreting OptimizationResult.

Workflow

  1. Select optimizer strategy (MetaPromptOptimizer, FewShotBayesianOptimizer, HRPO, etc.) based on the target optimization goal.
  2. Build prompt/dataset/metric wiring and validate placeholder-field alignment.
  3. Run prompt, tool, or parameter optimization with explicit controls (n_threads, n_samples, max_trials, seed).
  4. Inspect OptimizationResult and compare score deltas against initial baselines.
  5. Summarize recommendations, risks, and next experiments.
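
Step 2's placeholder-field alignment can be checked before any model call is made. The following is a minimal sketch, assuming prompts use `str.format`-style placeholders such as `{question}` and contain no literal braces. The helper names `placeholder_names` and `missing_fields` are hypothetical and are not part of opik_optimizer.

```python
from __future__ import annotations

from string import Formatter


def placeholder_names(template: str) -> set[str]:
    """Return the set of {field} names used in a format-style template."""
    return {
        field_name
        for _, field_name, _, _ in Formatter().parse(template)
        if field_name  # skip literal text and empty "{}" fields
    }


def missing_fields(templates: list[str], dataset_item: dict) -> set[str]:
    """Placeholders referenced by any template but absent from the dataset item."""
    needed = set().union(*(placeholder_names(t) for t in templates))
    return needed - set(dataset_item)


# Hypothetical prompt texts and dataset row, used only for illustration.
templates = ["You are a concise answerer.", "{question}"]
item = {"question": "Who wrote Dune?", "answer": "Frank Herbert"}
print(missing_fields(templates, item))
```

An empty result means every placeholder has a matching dataset field. A non-empty result names the fields to fix before starting the run.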

Inputs

  • Target optimization objective (prompt/tool/parameter) and success metric.
  • Dataset source and expected schema fields.
  • Model/provider constraints and runtime limits.
  • Optional scope constraints (optimize_prompts segments, tool fields, project names).

Outputs

  • Optimizer run configuration and rationale.
  • Result interpretation (score, initial_score, history trends).
  • Recommended next changes and follow-up experiment plan.

Before implementing code, consult the reference files in this skill for details:

  • references/algorithms.md
  • references/prompt_agent_workflow.md
  • references/example_patterns.md

Opik Optimizer quickstart

  1. Install (shell) and import (Python):
pip install opik-optimizer
from opik_optimizer import ChatPrompt, MetaPromptOptimizer, HRPO, FewShotBayesianOptimizer
from opik_optimizer import datasets
  2. Build a prompt and metric:
from opik.evaluation.metrics import LevenshteinRatio

prompt = ChatPrompt(
    system="You are a concise answerer.",
    user="{question}",
)

def metric(dataset_item: dict, output: str) -> float:
    return LevenshteinRatio().score(
        reference=dataset_item["answer"],
        output=output,
    ).value
  3. Load dataset and run:
dataset = datasets.hotpot(count=30)

result = MetaPromptOptimizer(model="openai/gpt-5-nano").optimize_prompt(
    prompt=prompt,
    dataset=dataset,
    metric=metric,
    n_samples=20,
    max_trials=10,
)
result.display()
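
To try the metric contract without Opik installed, a dependency-free stand-in can follow the same `(dataset_item, llm_output) -> float` shape. This sketch swaps in the standard library's `difflib.SequenceMatcher` ratio, which is not the Levenshtein ratio used in the quickstart. The function name `similarity_metric` and the lowercase/strip normalization are assumptions made for illustration.

```python
from difflib import SequenceMatcher


def similarity_metric(dataset_item: dict, llm_output: str) -> float:
    """Score in [0, 1]: similarity between the model output and the reference answer."""
    # Assumes the dataset item carries the reference under the "answer" key.
    reference = str(dataset_item["answer"]).strip().lower()
    candidate = llm_output.strip().lower()
    return SequenceMatcher(None, reference, candidate).ratio()


# An exact match, ignoring case and surrounding whitespace, scores 1.0.
print(similarity_metric({"answer": "Paris"}, " paris "))
```

Scores from this stand-in are not comparable with `LevenshteinRatio` scores, so it is best kept for smoke tests of the wiring.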

Core workflow you should follow

  1. Pick optimizer class:
    • Few-shot examples + Bayesian selection: FewShotBayesianOptimizer
    • LLM meta-reasoning: MetaPromptOptimizer
    • Genetic search with multi-objective optimization (MOO) and LLM crossover: EvolutionaryOptimizer
    • Hierarchical reflective diagnostics: HierarchicalReflectiveOptimizer (HRPO)
    • Pareto-based genetic strategy: GepaOptimizer
    • Parameter tuning only: ParameterOptimizer
  2. Define a single ChatPrompt (or dict of prompts for multi-prompt cases).
  3. Provide a dataset from opik_optimizer.datasets.
  4. Provide metric callable with signature (dataset_item, llm_output) -> float (or ScoreResult/list of ScoreResult).
  5. Set optimizer controls (n_threads, n_samples, max_trials, seed, etc.).
  6. Run one of:
    • optimize_prompt(...) for prompt/system behavior changes.
    • optimize_parameter(...) for model-call hyperparameters.
  7. Inspect OptimizationResult (score, initial_score, history, optimization_id, get_optimized_parameters).
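
When comparing `score` against `initial_score`, it helps to report both the absolute and the relative change. The sketch below uses a hypothetical `ResultSummary` dataclass as a stand-in for the two fields read from an OptimizationResult; it does not import or reproduce the real class.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ResultSummary:
    """Hypothetical stand-in holding the two scores read from an OptimizationResult."""
    initial_score: float
    score: float


def score_delta(result: ResultSummary) -> tuple[float, float | None]:
    """Return (absolute gain, relative gain); relative is None when the baseline is 0."""
    absolute = result.score - result.initial_score
    relative = absolute / result.initial_score if result.initial_score else None
    return absolute, relative


absolute, relative = score_delta(ResultSummary(initial_score=0.5, score=0.75))
print(f"absolute gain: {absolute:+.2f}")
if relative is not None:
    print(f"relative gain: {relative:+.0%}")
```

A zero baseline makes the relative gain undefined, so the sketch returns `None` rather than dividing by zero.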

Key execution details to enforce

  • Prefer explicit project_name for Opik tracking if you are using org-level observability.
  • Keep placeholders in prompts aligned with dataset fields (for example {question}).
  • Start with optimize_prompts="system" or "user" when scope should be constrained.
  • Keep model names in MetaPrompt/reasoning calls provider-compatible for your account.
  • For multimodal inputs, validate payloads and keep only non-empty content segments.
  • For small datasets, set n_samples and n_samples_strategy carefully; if n_samples exceeds the dataset size, the run automatically falls back to the full set.
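
The fallback behavior for small datasets can be pictured with a seeded sampler. This is an illustrative sketch only, not the library's implementation: the function name `select_samples` and the default seed are assumptions, and `n_samples_strategy` is not modeled.

```python
from __future__ import annotations

import random


def select_samples(items: list, n_samples: int | None, seed: int = 42) -> list:
    """Pick a reproducible subset; fall back to the full set when the request is too large."""
    if n_samples is None or n_samples >= len(items):
        return list(items)
    # A private Random instance keeps the draw reproducible without touching global state.
    return random.Random(seed).sample(items, n_samples)


small_dataset = list(range(5))
print(len(select_samples(small_dataset, n_samples=20)))
```

With five items and `n_samples=20`, the whole dataset is returned, which is why a large `n_samples` on a small dataset gives no extra signal.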

Tooling and segment-based control

  • Tools can be optimized through their MCP or function-schema fields, not only by changing prompt wording.
  • For fine-grained text updates, use optimize_prompts values and helper functions from prompt_segments:
    • extract_prompt_segments(ChatPrompt) to inspect stable segment IDs.
    • apply_segment_updates(ChatPrompt, updates) for deterministic edits.
  • Tool optimization is distinct from prompt optimization.

Runnable examples live upstream in the Opik repo. If you need local runnable scripts, vendor the upstream examples into a scripts/ folder and keep references one level deep.

Common mistakes to avoid

  • Passing an empty dataset or mismatched placeholder names.
  • Mixing the deprecated constructor argument num_threads with n_threads.
  • Assuming tool optimization is the same as agent function-calling optimization.
  • Calling ParameterOptimizer.optimize_prompt, which raises; use optimize_parameter for parameter tuning instead.
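
Two of these mistakes, the empty dataset and the mixed thread arguments, can be caught with a small pre-flight check before a run starts. This is a sketch under stated assumptions: `preflight` is a hypothetical helper, the dataset is represented as a plain list of items, and the optimizer arguments as a plain dict.

```python
from __future__ import annotations


def preflight(dataset_items: list, optimizer_kwargs: dict) -> list[str]:
    """Collect human-readable problems before starting an optimization run."""
    problems = []
    if not dataset_items:
        problems.append("dataset is empty")
    if "num_threads" in optimizer_kwargs:
        if "n_threads" in optimizer_kwargs:
            problems.append("both num_threads and n_threads are set; keep only n_threads")
        else:
            problems.append("num_threads is deprecated; use n_threads")
    return problems


# Hypothetical inputs: an empty dataset and a mixed pair of thread arguments.
for problem in preflight([], {"num_threads": 4, "n_threads": 8}):
    print(problem)
```

Returning a list instead of raising on the first problem lets a caller report every issue in one pass.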

Next actions

  • For in-depth behavior and per-class parameter tables: references/algorithms.md
  • For exact optimize_prompt signatures, prompts, tool constraints, and result usage: references/prompt_agent_workflow.md
  • For pattern examples and source-backed workflows: references/example_patterns.md

First seen: Feb 17, 2026