fiftyone-create-notebook

Installation

SKILL.md

Create FiftyOne Notebooks

Key Directives
Complete Workflow
Notebook Structure Reference
Common Use Cases
Troubleshooting
Best Practices
Resources

Key Directives

ALWAYS follow these rules:

1. Determine notebook type first

Classify the user's request before anything else:

User Request Pattern	Type	Template
"getting started", "beginner", "intro", "first notebook"	Getting Started	GETTING-STARTED-TEMPLATES.md
"tutorial", "how to use X", "deep dive", "demonstrate X"	Tutorial	TUTORIAL-TEMPLATES.md
"recipe", "quick", "snippet", "how do I X"	Recipe	RECIPE-TEMPLATES.md
"full pipeline", "end to end", "ML pipeline", "complete workflow"	Full Pipeline	GETTING-STARTED-TEMPLATES.md with all stages

If ambiguous, ask the user.

2. Create the notebook file before adding cells

Use the Write tool to create a valid empty .ipynb file first:

{
  "nbformat": 4,
  "nbformat_minor": 2,
  "metadata": {
    "kernelspec": {
      "display_name": "Python 3",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "name": "python",
      "version": "3.10.0"
    }
  },
  "cells": []
}

Then use NotebookEdit with edit_mode: "insert" to add cells.

3. Follow FiftyOne code conventions

All generated code must use standard FiftyOne import aliases:

import fiftyone as fo
import fiftyone.zoo as foz
import fiftyone.brain as fob
import fiftyone.types as fot
from fiftyone import ViewField as F

See the "Code Pattern Sources" table below for full conventions.

4. Build notebooks cell-by-cell

Use NotebookEdit with edit_mode: "insert" for every cell. Build top-to-bottom using cell_id chaining: insert the first cell without cell_id (inserts at beginning), then read the notebook to get its cell_id, and insert each subsequent cell with cell_id set to the previous cell's ID. This ensures correct ordering.

Critical: Without cell_id, every insert goes to the beginning of the notebook, resulting in reversed cell order.

Never write the entire .ipynb JSON at once. Incremental cell insertion allows verification and correction.

5. Always include App visualization

Every notebook must include at least one cell with:

session = fo.launch_app(dataset)

FiftyOne's core value is visual exploration. Notebooks without App visualization miss the point.

6. Use zoo datasets when user has no data

For getting-started and tutorial notebooks, default to foz.load_zoo_dataset() so users can run the notebook without bringing their own data. For recipes and custom pipelines, support both zoo and user data.

7. Precede every code cell with a markdown cell

Never place two code cells in a row without a markdown cell in between. The markdown cell explains what the code does and why. This is critical for tutorials and getting-started guides. Recipes are exempt — consecutive code cells are permitted for brevity (imports + load data, core solution + verify).

8. Present outline before generating

Before creating any cells, draft a notebook outline showing:

Section headings
Cell types (markdown/code)
Brief description of each cell

Get user approval before generating.

9. Fetch current API documentation

When generating code, fetch https://docs.voxel51.com/llms.txt for the latest FiftyOne API patterns. This ensures generated code uses current APIs and avoids deprecated patterns.

10. Verify after generation

After all cells are inserted, read the notebook back with the Read tool to verify:

Cells are in correct order
Markdown/code alternation is maintained
Import aliases are correct
Narrative flow is logical

Complete Workflow

Phase 1: Requirements

Gather information before designing the notebook:

Classify notebook type — Use the decision matrix from Directive 1. Ask the user if ambiguous.
Determine domain and task — Ask or infer what ML task the notebook covers. Common domains include:
- Object Detection
- Image Classification
- Instance/Semantic Segmentation
- Embeddings & Similarity
- Data Curation (duplicates, outliers, annotation mistakes)
- Video analysis, 3D point clouds, or any other FiftyOne-supported workflow
Determine data source:
- Zoo dataset (recommended for tutorials): foz.load_zoo_dataset("coco-2017", ...)
- User's local data: fo.Dataset.from_dir(...)
- Hugging Face Hub: foz.load_zoo_dataset("https://huggingface.co/datasets/...", ...)
- Quickstart: foz.load_zoo_dataset("quickstart")
Determine pipeline stages:

Stage	Description	Always Include?
Data Loading	Load/create dataset	Yes
Exploration	App, stats, filtering	Yes
Brain Methods	Embeddings, uniqueness, duplicates	If relevant
Inference	Model predictions	If relevant
Evaluation	Metrics, TP/FP/FN analysis	If predictions + ground truth
Export	Save to format	Optional

Confirm with user — Present a summary:
- Notebook type: [Getting Started / Tutorial / Recipe / Full Pipeline]
- Domain: [Detection / Classification / ...]
- Data source: [Zoo dataset name / User path / HF URL]
- Pipeline stages: [Load, Explore, Infer, Evaluate, Export]
- Estimated cells: [N]

Phase 2: Design

Read the appropriate template doc as a structural guide (adapt to the user's domain — do not copy specific datasets, models, or field names from the examples):
- GETTING-STARTED-TEMPLATES.md for getting-started and full pipeline
- TUTORIAL-TEMPLATES.md for tutorials
- RECIPE-TEMPLATES.md for recipes
Fetch API documentation — Fetch https://docs.voxel51.com/llms.txt using the WebFetch tool for current FiftyOne API patterns. This is the authoritative source for SDK usage.
Draft notebook outline — Create a numbered list of cells with cell number, type (markdown/code), and content summary:

0. [markdown] Title + description
1. [markdown] What You Will Learn
2. [code] pip install
3. [markdown] ## Setup
4. [code] Imports
5. [markdown] ## Load Dataset
6. [code] Load from zoo + print dataset info
...

Present outline for approval — Show the outline to the user. Wait for approval before generating.

Phase 3: Generation

Create empty notebook file:

# Use Write tool
Write(
    file_path="/path/to/notebook.ipynb",
    content='{"nbformat": 4, "nbformat_minor": 2, "metadata": {"kernelspec": {"display_name": "Python 3", "language": "python", "name": "python3"}, "language_info": {"name": "python", "version": "3.10.0"}}, "cells": []}'
)

Ask the user for the notebook file path, or suggest a default based on the title.

Add cells with NotebookEdit using cell_id chaining:

Insert the first cell without cell_id (goes to beginning), then read the notebook to get the assigned cell_id. Each subsequent cell uses cell_id of the previous cell:

# First cell — no cell_id, inserts at beginning
NotebookEdit(
    notebook_path="/path/to/notebook.ipynb",
    new_source="# Title\n\nDescription paragraph.",
    cell_type="markdown",
    edit_mode="insert"
)

# Read notebook to get the first cell's ID (e.g., "cell-0")
Read(file_path="/path/to/notebook.ipynb")

# Second cell — chain after cell-0
NotebookEdit(
    notebook_path="/path/to/notebook.ipynb",
    cell_id="cell-0",
    new_source="!pip install -q fiftyone",
    cell_type="code",
    edit_mode="insert"
)

# Third cell — chain after cell-1
NotebookEdit(
    notebook_path="/path/to/notebook.ipynb",
    cell_id="cell-1",
    new_source="import fiftyone as fo",
    cell_type="code",
    edit_mode="insert"
)

Continue chaining: each new cell's cell_id = previous cell's ID (cell-0, cell-1, cell-2, ...).

Cell content guidelines:

Keep code cells under 15 lines
One concept per code cell
Include inline comments for non-obvious operations
Use print() liberally so users see output
Include # Expected output: comments where helpful

Review during generation — After every 5-10 cells, briefly verify the notebook is building correctly. Read it back if needed.

Phase 4: Validation

Read notebook back — Use the Read tool to read the entire .ipynb file. Verify:
- Total cell count matches outline
- Cell order is correct
- No missing sections
Verify code quality — Check all code cells for:
- Correct FiftyOne import aliases (fo, foz, fob, fot, F)
- No deprecated APIs
- Consistent variable names throughout
- dataset variable used consistently
- Field names match between inference, evaluation, and filtering cells
Verify narrative flow — Check markdown cells for:
- Logical progression from section to section
- No jargon without explanation (for getting-started)
- Each markdown cell explains the why, not just the what
Execute the notebook — Create an isolated virtual environment and run end-to-end with papermill to avoid modifying the user's system Python:
```
python -m venv .notebook-test-env
.notebook-test-env/bin/pip install -q papermill ipykernel anywidget
.notebook-test-env/bin/python -m ipykernel install --user --name python3 --display-name "Python 3"
.notebook-test-env/bin/papermill notebook.ipynb notebook_output.ipynb
```
Fix any runtime errors, then re-run until all cells pass. Clean up after: rm -rf .notebook-test-env. Common issues:
- session.view requires a DatasetView, not a Dataset — use dataset.view()
- Missing packages — add them to the notebook's install cell
- Plotly widgets need anywidget in headless environments
Present summary — Tell the user:
- Notebook path
- Total cells (N markdown + N code)
- Sections covered
- Execution status (all cells passed / errors fixed)
- How to run it (e.g., jupyter notebook path/to/notebook.ipynb)

Notebook Structure Reference

See NOTEBOOK-STRUCTURE.md for detailed cell structure patterns for each pipeline stage, including cell shapes and markdown patterns.

Code Pattern Sources

For actual code patterns, fetch https://docs.voxel51.com/llms.txt as the authoritative source. Related skills provide additional context for specific pipeline stages:

Pipeline Stage	Related Skill
Imports & conventions	fiftyone-code-style
Data loading	fiftyone-dataset-import
Inference	fiftyone-dataset-inference
Evaluation	fiftyone-model-evaluation
Export	fiftyone-dataset-export
Embeddings & visualization	fiftyone-embeddings-visualization
Duplicates	fiftyone-find-duplicates

Common Use Cases

Use Case 1: Getting Started with Object Detection

User says: "Create a getting-started notebook for object detection"

Notebook outline (21 cells):

Cell	Type	Content
0	markdown	`# Getting Started with Object Detection in FiftyOne`
1	markdown	What You Will Learn (bullet list)
2	code	`!pip install fiftyone ultralytics`
3	markdown	## Setup
4	code	Imports (`fo`, `foz`, `F`)
5	markdown	## Load Dataset
6	code	`dataset = foz.load_zoo_dataset(...)` + `print(dataset)` + `print(dataset.first())`
7	markdown	## Explore in the FiftyOne App
8	code	`session = fo.launch_app(dataset)`
9	markdown	## Understand the Data
10	code	`dataset.count_values("ground_truth.detections.label")`
11	markdown	## Run Model Inference
12	code	`model = foz.load_zoo_model(...)` + `dataset.apply_model(...)` + `session.view = dataset.view()`
13	markdown	## Evaluate Predictions
14	code	`dataset.evaluate_detections(...)` + `results.print_report()`
15	markdown	## Analyze Errors
16	code	Evaluation patches: `dataset.to_evaluation_patches("eval")`, filter to FP
17	markdown	## Export for Training
18	code	`dataset.export(...)` to YOLOv5 format
19	markdown	## Conclusion + Next Steps
20	code	Cleanup: `fo.delete_dataset(...)`

Use Case 2: Tutorial - Finding Annotation Mistakes

User says: "Write a tutorial on finding annotation mistakes with FiftyOne"

Notebook outline (29 cells):

Phase	Cells	Content
Introduction	0-2	Title, problem statement (why annotation quality matters), learning goals
Setup	3-4	pip install, imports
Data	5-8	Load detection dataset, inspect schema, explore class distribution, launch App
Concept	9-10	Explain mistakenness: what it is, how it works, why embeddings help
Compute	11-14	Compute embeddings, compute mistakenness, view mistakenness distribution
Explore	15-18	Sort by mistakenness, view worst samples in App, tag suspicious annotations
Hardness	19-22	Compute hardness, compare with mistakenness, find ambiguous samples
Action	23-25	Filter to flagged samples, export for re-annotation
Conclusion	26-28	Summary, key takeaways, next steps, cleanup

Use Case 3: Recipe - Export to COCO Format

User says: "Quick recipe to export my dataset to COCO format"

Notebook outline (7 cells):

Cell	Type	Content
0	markdown	`# Export a FiftyOne Dataset to COCO Format` + one sentence description
1	code	`import fiftyone as fo` + `import fiftyone.types as fot`
2	code	`dataset = fo.load_dataset("my-dataset")`
3	markdown	Brief explanation of COCO format
4	code	`dataset.export(export_dir="/tmp/coco-export", dataset_type=fot.COCODetectionDataset, label_field="ground_truth")`
5	code	Verify: `!ls /tmp/coco-export/`
6	markdown	Variations: export with filters, export labels only, export to YOLOv5

Use Case 4: Full ML Pipeline

User says: "Create a complete ML pipeline notebook"

Notebook outline (31 cells):

Phase	Cells	Content
Title	0-1	Title + learning goals
Setup	2-3	pip install + imports
Load	4-6	Load dataset, inspect, print schema
Explore	7-10	Launch App, class distribution, sample statistics, filtering
Deduplicate	11-13	Compute embeddings, find near-duplicates, remove duplicates
Infer	14-16	Load model, apply to dataset, view predictions in App
Evaluate	17-21	Run evaluation, print report, confusion matrix, PR curves, patches
Visualize	22-24	Compute UMAP visualization, explore embedding space, find clusters
Export	25-27	Export curated dataset, export to training format
Conclusion	28-30	Summary, next steps, cleanup

Troubleshooting

Error: "Cells appear in wrong order"

Cause: Inserting cells without cell_id — each insert without cell_id goes to the beginning of the notebook, resulting in reversed order
Solution: Use cell_id chaining (see Directive 4). Insert first cell without cell_id, read notebook to get its ID, then chain each subsequent cell after the previous one. Use edit_mode: "replace" to fix individual cells.

Error: "Empty notebook after generation"

Cause: Write tool did not create valid JSON structure
Solution: Verify the minimal .ipynb JSON includes "cells": [] and "nbformat": 4

Error: "Import errors in generated code"

Cause: Using deprecated or incorrect FiftyOne APIs
Solution: Fetch https://docs.voxel51.com/llms.txt before generating code. Use standard aliases: fo, foz, fob, fot, F. Check that pip install cell includes all required packages.

Error: "Generated code uses deprecated APIs"

Cause: Stale API patterns
Solution: Fetch https://docs.voxel51.com/llms.txt for current API. Common deprecation: dataset.count("field") → dataset.count_values("field")

Error: "Notebook doesn't render in Jupyter"

Cause: Invalid .ipynb JSON structure
Solution: Verify "nbformat": 4 is set and "cells" is an array, not null

Best Practices

Keep code cells short - Under 15 lines, one concept per cell
Use markdown liberally - Every code cell should have a markdown cell explaining it
Show output - Use print() and display so users see intermediate results
Include App visualization - At least one fo.launch_app() call per notebook
Use zoo datasets for demos - Makes notebooks immediately runnable
Include cleanup - Add a cleanup cell at the end to delete temporary datasets
Link to docs - Include links to relevant FiftyOne documentation
Progressive complexity - Start simple, build to complex
Explain the "why" - Don't just show code; explain why each step matters
Test your outline first - Get user approval on the structure before generating 30+ cells

Resources

FiftyOne Documentation
FiftyOne LLM Docs - Fetch for comprehensive API reference
FiftyOne Tutorials
FiftyOne Recipes
FiftyOne Model Zoo
FiftyOne User Guide

Related skills

More from voxel51/fiftyone-skills

Installs

Repository

voxel51/fiftyone-skills

GitHub Stars

First Seen

Mar 11, 2026

Security Audits

Gen Agent Trust HubPass

SocketFail

SnykWarn