Evaluate Fusion MCP Search Index

When to use

Use this skill to evaluate whether Fusion MCP search returns accurate, relevant results for documented framework patterns.

Typical triggers:

eval core — evaluate all patterns in eval/index/core.md
eval http-services — evaluate patterns in eval/index/http-services.md
eval all — evaluate every domain file in eval/index/
"check MCP index accuracy for auth patterns"
"validate search recall for the state-data domain"
"how well does the index cover framework initialization?"
"run an eval pass against the index"

When not to use

Do not use this skill for:

Writing or populating domain pattern files (use normal editing workflow)
Creating new domain files in eval/index/ (follow the README template)
CI/CD integration or scheduled evaluation runs (future phase)
Batch automation without human review (future phase)
Searching the index for application development answers (use Fusion MCP directly)

Required inputs

Collect before execution:

Domain target: a specific domain file name (e.g., core, http-services) or all
Evaluation strictness (optional): strict (must expectations only) or full (must and should expectations); default is full

If the user says only eval with no domain, ask which domain to evaluate or whether to run all.
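The input rules above can be sketched as a small parser. This is a minimal illustration, not part of the skill: the `EvalRequest` type, the `parse_eval_request` name, and the convention that strictness is passed as a third token are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EvalRequest:
    domain: str               # e.g. "core", "http-services", or "all"
    strictness: str = "full"  # "strict" or "full"; full is the default


def parse_eval_request(text: str) -> Optional[EvalRequest]:
    """Parse 'eval <domain> [strict|full]' (hypothetical syntax).

    Returns None when no domain is given, so the caller can ask the user
    which domain to evaluate or whether to run all.
    """
    tokens = text.strip().split()
    if not tokens or tokens[0].lower() != "eval":
        raise ValueError("not an eval command")
    if len(tokens) == 1:
        return None
    strictness = "full"
    if len(tokens) >= 3:
        strictness = tokens[2].lower()
        if strictness not in ("strict", "full"):
            raise ValueError(f"unknown strictness: {tokens[2]}")
    return EvalRequest(domain=tokens[1], strictness=strictness)
```

Returning None for a bare eval keeps the "ask the user" decision with the caller instead of guessing a default domain.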

Instructions

Step 1 — Resolve domain files

If the target is a specific domain (e.g., core), read eval/index/<domain>.md
If the target is all, list all .md files in eval/index/ except README.md and process each one sequentially
If the file does not exist or contains no ## query headings, report it as empty and skip it
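A minimal sketch of this resolution step, assuming the index directory is available on the local file system. The function names `resolve_domain_files` and `has_query_sections` are illustrative and do not come from the skill.

```python
from pathlib import Path
from typing import List


def resolve_domain_files(target: str, index_dir: Path) -> List[Path]:
    """Return the domain files to evaluate for a target name or 'all'."""
    if target == "all":
        # Every .md file except the README, in a stable order.
        return sorted(p for p in index_dir.glob("*.md") if p.name != "README.md")
    candidate = index_dir / f"{target}.md"
    # A missing file yields an empty list so the caller can report it as empty.
    return [candidate] if candidate.is_file() else []


def has_query_sections(path: Path) -> bool:
    """True if the file contains at least one '## ' query heading."""
    lines = path.read_text(encoding="utf-8").splitlines()
    return any(line.startswith("## ") for line in lines)
```

A caller would typically use `resolve_domain_files("all", Path("eval/index"))` and skip any file for which `has_query_sections` returns False.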

Step 2 — Parse the domain file

Domain files have a simple structure:

# Domain Name — the domain heading
Paragraph below # — judgement instructions for how to evaluate results across all queries in this domain
## <query> — each ## heading is a search query to run against MCP
- must ... / - should ... — expectations for each query's results

Extract the judgement instructions once (use them as context when evaluating every query). Then build a list of queries, each with its must and should expectations.
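The structure described above could be parsed roughly as follows. This is a sketch under the assumption that expectation bullets start literally with "- must " or "- should "; the `Query` type and `parse_domain_file` name are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Query:
    text: str
    must: List[str] = field(default_factory=list)
    should: List[str] = field(default_factory=list)


def parse_domain_file(markdown: str) -> Tuple[str, str, List[Query]]:
    """Return (domain name, judgement instructions, queries)."""
    domain = ""
    instructions: List[str] = []
    queries: List[Query] = []
    for raw in markdown.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            # Each level-2 heading starts a new search query.
            queries.append(Query(text=line[3:].strip()))
        elif line.startswith("# "):
            domain = line[2:].strip()
        elif queries:
            # Bullets after a query heading are its expectations.
            if line.startswith("- must "):
                queries[-1].must.append(line[len("- must "):].strip())
            elif line.startswith("- should "):
                queries[-1].should.append(line[len("- should "):].strip())
        elif domain and line:
            # Text between the domain heading and the first query.
            instructions.append(line)
    return domain, " ".join(instructions), queries
```

The judgement instructions are returned once per file so they can be passed unchanged to the evaluation of every query.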

Step 3 — Evaluate each query via judge sub-agent

For each ## heading, spin up a query-judge sub-agent (see agents/query-judge.md). Pass it:

The heading text (search query)
The must and should bullets for that query
The domain-level judgement instructions

The judge sub-agent will:

Search MCP using the heading text
Check results against each expectation
Return a verdict (pass / partial / fail) with counts of expectations met and an explanation
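One way to picture what the judge returns is a small record plus a classification rule. The skill does not define how counts map to a verdict, so the rule below is purely an assumption for illustration (any unmet must expectation fails; unmet should expectations downgrade to partial only under full strictness). The actual judge described in agents/query-judge.md may decide differently.

```python
from dataclasses import dataclass


@dataclass
class Verdict:
    query: str
    must_met: int
    must_total: int
    should_met: int
    should_total: int
    explanation: str


def classify(v: Verdict, strictness: str = "full") -> str:
    """Map expectation counts to pass / partial / fail (assumed rule)."""
    if v.must_met < v.must_total:
        return "fail"      # assumption: any unmet must expectation fails
    if strictness == "full" and v.should_met < v.should_total:
        return "partial"   # assumption: should gaps only count in full mode
    return "pass"
```

Keeping the counts on the record lets the report show figures such as must expectations met without re-running the search.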

If the runtime does not support sub-agents, follow the same workflow inline:

Use the heading text as the search query
Call mcp_fusion_search_framework (preferred) or mcp_fusion_search
Check results against each must / should bullet
Record verdict with explanation

Collect all verdicts before producing the report.
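For the inline fallback, the tool preference can be sketched as below. The tool names come from the step above; representing tools as a dictionary of Python callables that take a query string and return a list of result strings is an assumption made only so the example is runnable.

```python
from typing import Callable, Dict, List

# Hypothetical shape: a tool takes the query text and returns result snippets.
SearchTool = Callable[[str], List[str]]

# Order encodes the preference stated in the step above.
PREFERRED_TOOLS = ("mcp_fusion_search_framework", "mcp_fusion_search")


def run_search(query: str, tools: Dict[str, SearchTool]) -> List[str]:
    """Call the most preferred available search tool exactly once."""
    for name in PREFERRED_TOOLS:
        tool = tools.get(name)
        if tool is not None:
            return tool(query)
    raise RuntimeError("no Fusion MCP search tool is available")
```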

Step 4 — Generate the evaluation report

Produce a report following the template in assets/report-template.md. The report includes:

Header: domain name, evaluation date, strictness level
Pattern results table: one row per pattern with name, requirement, verdict, and explanation
Summary statistics: total patterns, pass/partial/fail counts, pass rate
Recommendations: actionable next steps for failed or partial patterns
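The summary statistics can be derived directly from the collected verdicts. The helper names below are illustrative, and the summary line format is an assumption modeled on a "Queries | Pass | Partial | Fail" layout; the authoritative layout is whatever assets/report-template.md specifies.

```python
from collections import Counter
from typing import Dict, List


def summarize(verdicts: List[str]) -> Dict[str, float]:
    """Count pass / partial / fail verdicts and compute the pass rate."""
    counts = Counter(verdicts)
    total = len(verdicts)
    return {
        "total": total,
        "pass": counts["pass"],
        "partial": counts["partial"],
        "fail": counts["fail"],
        # Guard against an empty domain file.
        "pass_rate": counts["pass"] / total if total else 0.0,
    }


def format_summary_line(stats: Dict[str, float]) -> str:
    """Render the one-line summary (assumed format)."""
    return (
        f"Queries: {stats['total']} | Pass: {stats['pass']} | "
        f"Partial: {stats['partial']} | Fail: {stats['fail']}"
    )
```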

Step 5 — Present results

Print the full report to the conversation
For eval all, present a per-domain summary first, then offer to show detailed results for any domain
Highlight must failures prominently — these indicate critical index gaps
Suggest concrete remediation for each failure (e.g., "re-index package X", "add documentation for Y")

Expected output

A structured evaluation report containing:

Per-pattern pass/partial/fail verdicts with explanations
Summary statistics (total, pass rate, must vs should breakdown)
Prioritized recommendations for improving index coverage
Clear identification of stale or missing content

Example: `eval core`

User: eval core

Workflow:

Read eval/index/core.md
Note judgement instructions: "Results should reference @equinor/fusion-framework and @equinor/fusion-framework-module..."
Extract 5 queries from ## headings
For each, search MCP and check must/should bullets
Produce report:

## Evaluation Report: core

Date: 2026-03-14
Strictness: full
Domain: eval/index/core.md

| # | Query | Verdict | Explanation |
|---|-------|---------|-------------|
| 1 | How to initialize Fusion Framework | pass | Results mention FrameworkConfigurator, init, configureMsal |
| 2 | How to create a custom module | pass | Module interface, BaseConfigBuilder, BaseModuleProvider all covered |
| 3 | Module lifecycle phases | partial | Lifecycle order shown but postConfigure/postInitialize hooks not detailed |
| 4 | How to configure an app with AppModuleInitiator | fail | No results reference AppModuleInitiator |
| 5 | How to listen to framework events | pass | addEventListener, dispatchEvent, event module all returned |

### Summary
- Queries: 5 | Pass: 3 | Partial: 1 | Fail: 1
- Must expectations met: 14/17 (82%)

### Recommendations
1. **[CRITICAL]** Query 4: Re-index `packages/app/src/types.ts` and `cookbooks/app-react/src/config.ts`
2. **[IMPROVE]** Query 3: Enrich lifecycle section in `packages/modules/module/README.md`

Safety & constraints

This skill is read-only with respect to the repository — it never modifies domain files or index content
MCP search calls are read-only queries; no mutations are performed
Do not fabricate pass verdicts; if results are ambiguous, mark the query as partial and explain why
If MCP is unavailable or rate-limited, report the failure clearly and stop; do not retry in a loop
Evaluation results reflect index state at query time; they are not cached across sessions
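The stop-on-failure constraint can be sketched as a loop that makes one attempt per query and halts at the first MCP error. The `McpUnavailable` exception and the `search` callable are hypothetical stand-ins for however the runtime surfaces an unavailable or rate-limited MCP server.

```python
from typing import Callable, Dict, List, Optional, Tuple


class McpUnavailable(Exception):
    """Hypothetical error for an unavailable or rate-limited MCP server."""


def evaluate_all(
    queries: List[str],
    search: Callable[[str], List[str]],
) -> Tuple[Dict[str, List[str]], Optional[str]]:
    """Search each query once; stop at the first MCP failure.

    Returns the results gathered so far and an error message, or None
    when every query was searched successfully.
    """
    results: Dict[str, List[str]] = {}
    for query in queries:
        try:
            results[query] = search(query)  # single attempt, no retry
        except McpUnavailable as exc:
            return results, f"MCP search failed on {query!r}: {exc}. Evaluation stopped."
    return results, None
```

Returning the partial results alongside the message lets the report state clearly which queries were evaluated before the failure.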
