best-practices-check

Best Practices Verification

Verify that skills in this repository align with current Microsoft Fabric best practices from official documentation and community sources.

When to Use

  • After creating or updating a skill
  • To ensure a skill covers recommended patterns
  • To identify gaps in guidance
  • Before major releases to validate content currency

Workflow

Step 1: Identify Target Skill

Parse the user's request to identify which skill to verify:

| User Request | Target Skill |
|--------------|--------------|
| "check best practices for spark consumption" | spark-consumption-cli |
| "validate best practices for sqldw authoring" | sqldw-authoring-cli |
| "best practices for medallion" | skills/e2e-medallion-architecture/SKILL.md |
| "check best practices for SQL endpoint" | sqldw-consumption-cli |

Skill name normalization:

  • "spark" → spark-authoring-cli or spark-consumption-cli (ask if ambiguous)
  • "sqldw", "warehouse", "SQL endpoint" → sqldw-authoring-cli or sqldw-consumption-cli
  • "medallion", "bronze/silver/gold" → skills/e2e-medallion-architecture/SKILL.md
  • "data engineering" → spark-authoring-cli
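
The normalization rules above can be sketched as a small shell function. This is only an illustration: `resolve_skill` is a hypothetical name, the keyword matching is deliberately naive, and resolving a bare "SQL endpoint" request to the consumption skill is an assumption taken from the example table rather than a stated rule.

```shell
# Hypothetical helper: print the candidate skill(s) for a free-text request.
# One line of output is an unambiguous match; two lines mean "ask the user".
resolve_skill() {
  request=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')

  # Pick the skill family from the keywords in the normalization list.
  case "$request" in
    *medallion*|*bronze*|*silver*|*gold*)
      echo "skills/e2e-medallion-architecture/SKILL.md"; return 0 ;;
    *"data engineering"*)
      echo "spark-authoring-cli"; return 0 ;;
    *spark*) family="spark" ;;
    *sqldw*|*warehouse*|*"sql endpoint"*) family="sqldw" ;;
    *) return 1 ;;   # no known keyword: the caller should ask the user
  esac

  # Narrow to authoring or consumption when the request says which.
  # Assumption: "SQL endpoint" alone means consumption, as in the example table.
  case "$request" in
    *authoring*)                    echo "${family}-authoring-cli" ;;
    *consumption*|*"sql endpoint"*) echo "${family}-consumption-cli" ;;
    *)                              echo "${family}-authoring-cli"
                                    echo "${family}-consumption-cli" ;;
  esac
}
```

Calling `resolve_skill "spark"` prints both Spark skills, which is the cue to ask the user which one they mean.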

Step 2: Read Skill Content

Load the target skill's SKILL.md and any referenced resources:

# Example: Read skill content
cat skills/spark-consumption-cli/SKILL.md

# If skill has resources folder, read those too
ls skills/spark-consumption-cli/resources/ 2>/dev/null && \
  cat skills/spark-consumption-cli/resources/*.md

Extract key topics covered:

  • Must/Prefer/Avoid guidance
  • Specific patterns mentioned
  • Technologies referenced
  • Example scenarios
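
One way to pull the Must/Prefer/Avoid guidance out of a skill is to print the bullets under a given heading. The sketch below assumes the target SKILL.md uses markdown `#` headings and `-` or `*` bullets; `section_bullets` is a hypothetical helper, not part of the repository.

```shell
# Hypothetical helper: print the bullet lines under one heading of a SKILL.md.
# Usage: section_bullets <heading text> <file>
section_bullets() {
  awk -v want="$1" '
    /^#+[ \t]/ {                      # any markdown heading line
      title = $0
      sub(/^#+[ \t]+/, "", title)     # strip the leading #s
      in_section = (title == want)    # are we now inside the wanted section?
      next
    }
    in_section && /^[ \t]*[-*][ \t]/ { print }   # bullets in that section only
  ' "$2"
}
```

For example, `section_bullets Must skills/spark-consumption-cli/SKILL.md` would list only the Must bullets, which can then be compared against the practices found in Step 3.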

Step 3: Search for Current Best Practices

Use web search to find current Microsoft Fabric best practices. Always include "Microsoft Fabric" in search queries to ensure results are Fabric-specific.

Search queries to execute (adjust based on skill topic):

| Skill Type | Search Queries |
|------------|----------------|
| Spark/Data Engineering | "Microsoft Fabric Spark best practices 2025", "Fabric Lakehouse optimization", "Fabric notebook development best practices" |
| SQL Endpoint/Warehouse | "Microsoft Fabric Data Warehouse best practices", "Fabric T-SQL performance optimization", "Fabric SQL endpoint security best practices" |
| Medallion Architecture | "Microsoft Fabric medallion architecture best practices", "Fabric Bronze Silver Gold layer design", "Fabric lakehouse data modeling" |
| General | "Microsoft Fabric {topic} best practices", "Fabric {topic} performance tuning", "Fabric {topic} security" |
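
As a rough sketch, the "General" templates can be expanded for a topic while enforcing the rule that every query names "Microsoft Fabric". Both function names are hypothetical, and the topic is assumed to contain no `%` characters because it passes through a `printf` format.

```shell
# Hypothetical helper: make sure a query names "Microsoft Fabric" explicitly.
fabric_query() {
  case "$1" in
    *"Microsoft Fabric"*) printf '%s\n' "$1" ;;                  # already fine
    Fabric\ *)            printf 'Microsoft %s\n' "$1" ;;        # add the prefix
    *)                    printf 'Microsoft Fabric %s\n' "$1" ;; # no product name
  esac
}

# Hypothetical helper: expand the "General" templates for one topic.
build_queries() {
  for template in "Microsoft Fabric %s best practices" \
                  "Fabric %s performance tuning" \
                  "Fabric %s security"; do
    # The template is used as the printf format string on purpose.
    fabric_query "$(printf "$template" "$1")"
  done
}
```

`build_queries "V-Order"` prints three queries, each starting with "Microsoft Fabric".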

Priority sources (weight these higher):

  1. learn.microsoft.com/en-us/fabric/ — Official Microsoft documentation
  2. blog.fabric.microsoft.com/ — Official Fabric blog
  3. techcommunity.microsoft.com/ — Microsoft Tech Community
  4. Transcripts of recent conference talks (Ignite, Build)
  5. Fabric CAT team blogs and whitepapers
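
The first three priorities can be recognized from the URL alone, so a simple ranking pass is possible. The sketch below is an assumption about how one might weight results; `source_priority` and `rank_sources` are hypothetical names. Conference transcripts and CAT team material have no single domain and fall into the unranked bucket here, so they still need manual judgment.

```shell
# Hypothetical helper: lower number = higher priority; 9 = unranked.
source_priority() {
  case "$1" in
    https://learn.microsoft.com/*/fabric/*) echo 1 ;;  # official docs, any locale
    https://blog.fabric.microsoft.com/*)    echo 2 ;;  # official Fabric blog
    https://techcommunity.microsoft.com/*)  echo 3 ;;  # Microsoft Tech Community
    *)                                      echo 9 ;;  # everything else
  esac
}

# Read URLs on stdin, one per line, and print them best-first.
rank_sources() {
  while IFS= read -r url; do
    printf '%s %s\n' "$(source_priority "$url")" "$url"
  done | sort -n
}
```

Piping the URLs from a search result page through `rank_sources` puts official documentation at the top of the list.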

Step 4: Compare and Analyze

Create a comparison matrix:

| Best Practice | Source | Covered in Skill? | Notes |
|---------------|--------|-------------------|-------|
| Use starter pools for Livy sessions | MS Docs | ✅ Yes | Session Management |
| Enable adaptive query execution | Tech Community | ⚠️ Partial | Mentioned but no config example |
| Avoid SELECT * on large tables | MS Docs | ✅ Yes | Avoid section |
| Use V-Order for read-heavy workloads | Fabric Blog | ❌ No | Gap - should add |
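
The counts needed for the report summary can be derived from the matrix itself. A minimal sketch, assuming the matrix keeps the four-column layout shown above and the status column contains the words Yes, Partial, or No; `tally_matrix` is a hypothetical helper.

```shell
# Hypothetical helper: count matrix rows by status.
# Reads the markdown table on stdin; prints "covered partial gap".
tally_matrix() {
  # With a leading "|", field 4 is the "Covered in Skill?" column.
  awk -F'|' '
    $4 ~ /Yes/     { covered++ }
    $4 ~ /Partial/ { partial++ }
    $4 ~ /No/      { gap++ }
    END { printf "%d %d %d\n", covered, partial, gap }
  '
}
```

Fed the four example rows above, it prints `2 1 1`. The header and separator rows are ignored because their fourth field contains none of the three status words.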

Step 5: Generate Report

Produce a structured report:

## Best Practices Verification Report

**Skill:** spark-consumption-cli
**Date:** {current_date}
**Sources Consulted:** {list_of_sources}

### Summary

| Category | Count |
|----------|-------|
| ✅ Covered | X |
| ⚠️ Partial | Y |
| ❌ Gap | Z |

### Fully Covered Best Practices

1. **{Practice Name}** — {brief description}
   - Source: {URL}
   - Skill location: {section reference}

### Partially Covered (Needs Enhancement)

1. **{Practice Name}** — {what's missing}
   - Source: {URL}
   - Current coverage: {what exists}
   - Recommendation: {specific improvement}

### Gaps (Not Covered)

1. **{Practice Name}** — {description}
   - Source: {URL}
   - Why important: {impact}
   - Suggested addition: {where to add and what content}

### Recommendations

1. {Priority 1 recommendation}
2. {Priority 2 recommendation}
3. {Priority 3 recommendation}

Must

  • Always search with "Microsoft Fabric" in query — generic Spark/SQL results may not apply
  • Cite sources for every best practice identified
  • Read the actual skill content before comparing — don't assume
  • Distinguish between authoring and consumption — best practices differ
  • Check publication dates — prefer content from 2024-2025; flag older sources
  • Focus on actionable gaps — prioritize high-impact missing guidance
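
The publication-date rule can be applied mechanically once a source's date is known. The sketch below assumes the date is available as an ISO string (YYYY-MM-DD); `check_pub_date` is a hypothetical helper, and the default cutoff of 2024 simply mirrors the rule above.

```shell
# Hypothetical helper: flag sources published before a cutoff year.
# Usage: check_pub_date <YYYY-MM-DD> [cutoff_year]
check_pub_date() {
  year=${1%%-*}          # keep everything before the first "-"
  cutoff=${2:-2024}      # default follows the 2024-2025 preference
  if [ "$year" -lt "$cutoff" ]; then
    echo "flag: published $year, older than $cutoff"
  else
    echo "ok"
  fi
}
```

A flagged source is not automatically wrong; it only signals that the guidance should be checked against newer documentation before it is cited.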

Prefer

  • Official Microsoft documentation over community posts
  • Recent content (last 12 months) over older articles
  • Fabric-specific guidance over generic Spark/SQL advice
  • Concrete, implementable recommendations over vague suggestions
  • Linking to common/ documents for foundational patterns
  • Grouping related gaps into single recommendations

Avoid

  • Generic cloud data platform advice not specific to Fabric
  • Outdated patterns (pre-GA Fabric guidance may be obsolete)
  • Nitpicking minor omissions — focus on material gaps
  • Recommending changes that duplicate content in common/ files
  • Suggesting additions that would push a skill past its token budget (15K warning threshold)
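
Before suggesting an addition, it can help to estimate how close a skill already is to the 15K warning threshold. The sketch below uses the common rule of thumb of roughly four bytes per token, which is an assumption and not necessarily how the repository counts tokens; both function names are hypothetical.

```shell
# Hypothetical helper: rough token estimate for a file.
# Assumption: ~4 bytes per token, a rule of thumb rather than a real tokenizer.
estimate_tokens() {
  bytes=$(wc -c < "$1")
  echo $(( bytes / 4 ))
}

# Hypothetical helper: warn when the estimate passes the 15K threshold.
check_budget() {
  tokens=$(estimate_tokens "$1")
  if [ "$tokens" -gt 15000 ]; then
    echo "WARN: ~$tokens tokens, over the 15K warning threshold"
    return 1
  fi
  echo "OK: ~$tokens tokens"
}
```

Running `check_budget skills/spark-consumption-cli/SKILL.md` before and after a proposed edit gives a quick sense of how much headroom remains.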

Example Usage

User Prompt

"Check best practices for spark consumption"

Expected Response

## Best Practices Verification: spark-consumption-cli

**Sources consulted:**
- Microsoft Learn: Fabric Spark development best practices
- Fabric Blog: Optimizing Livy sessions
- Tech Community: Fabric Lakehouse query patterns

### Summary: 8 covered, 2 partial, 3 gaps

### ✅ Fully Covered
1. **Session reuse over creation** — Skill explicitly checks for idle sessions (Must section)
2. **Use starter pools** — Configured in session creation (Must section)
3. **Avoid large result sets without LIMIT** — In Avoid section

### ⚠️ Partial Coverage
1. **Adaptive query execution** — Mentioned in config but no explanation of when to enable/disable
   - Recommendation: Add brief guidance on AQE tuning for different workload types

### ❌ Gaps
1. **Caching strategies for repeated queries**
   - Source: [MS Learn - Spark caching](https://learn.microsoft.com/...)
   - Recommendation: Add section on `.cache()` and `.persist()` for iterative analysis

2. **Cross-workspace query patterns**
   - Source: [Fabric Blog - Multi-workspace analytics](https://blog.fabric.microsoft.com/...)
   - Recommendation: Add example of querying across workspaces with proper shortcuts

### Priority Recommendations
1. Add caching guidance to Prefer / Data Exploration section
2. Expand AQE documentation with workload-specific configs
3. Add cross-workspace query example

Integration with Quality Check

Run this skill after the quality-check skill:

  1. quality-check validates structure and compliance
  2. best-practices-check validates content quality and completeness

Together they ensure skills are both well-formed and comprehensive.

Limitations

  • Web search results may vary; re-run periodically
  • Some best practices are context-dependent — use judgment
  • New Fabric features may not have documented best practices yet
  • Enterprise-specific guidance (security, compliance) may require internal sources