Storyteller Engine by smouj/storyteller-engine-skill

Purpose

Storyteller Engine transforms raw business data into compelling, human-readable narratives for executive reports, marketing materials, and data journalism. Real-world applications include:

Quarterly Business Reviews: Convert sales metrics, churn rates, and revenue data into executive summary narratives with trend analysis
Customer Insights: Transform NPS scores, survey responses, and demographic data into customer persona stories
Operational Reports: Generate daily/weekly operational summaries from incident logs, system metrics, and performance data
Financial Narratives: Create earnings call prep documents from financial statements and KPI dashboards
Marketing Performance: Turn campaign analytics (CTR, conversion rates, ROI) into stakeholder-ready performance stories

Scope

Core Commands

storyteller generate [OPTIONS]

Generates a narrative from an input data file. Options:

--input PATH: Path to CSV/JSON/Excel file (required)
--template NAME: Template name from template library or path to custom Jinja2 template
--output PATH: Output file path (default: stdout)
--format {markdown,html,pdf,txt}: Output format (default: markdown)
--tone {professional,casual,narrative,technical}: Writing style (default: professional)
--context TEXT: Additional context to include in prompt
--max-length INT: Maximum word count (default: 1000)
--temperature FLOAT: Sampling temperature, 0.0-2.0; higher values give more varied output (default: 0.7)
--role {executive,marketing,technical,general}: Audience persona

storyteller validate [OPTIONS]

Validates input data against schema requirements. Options:

--input PATH: Data file to validate
--schema PATH: JSON Schema definition (default: auto-detect)
--strict: Fail on missing optional fields

storyteller templates list

Lists available built-in templates with metadata.

storyteller templates show NAME

Displays template content and required fields.

storyteller config set KEY VALUE

Sets configuration in ~/.storyteller/config.yaml

storyteller --version

Shows version and active model configuration.

Detailed Work Process

1. Data Preparation

Input data must be structured (CSV, JSON, or Excel), with column headers that match the template's field names exactly. For example, a sales narrative template that requires quarter, revenue, growth_rate, and region needs a CSV containing those exact column names.

Example data file (sales_data.csv):

quarter,revenue,growth_rate,region
Q1-2023,1250000,0.12,North America
Q2-2023,1380000,0.104,North America
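
A quick pre-flight check for the header requirement can be sketched in Python. This is a minimal illustration using only the standard library, not the engine's own validator; REQUIRED_FIELDS and check_headers are hypothetical names, and the sample deliberately contains a capitalised header and omits region.

```python
import csv
import io

# Hypothetical field list for a sales template; a real template defines its own.
REQUIRED_FIELDS = ["quarter", "revenue", "growth_rate", "region"]

def check_headers(csv_text, required):
    """Return (missing, case_mismatches) for the CSV header row."""
    reader = csv.reader(io.StringIO(csv_text))
    headers = next(reader, [])
    lowered = {h.lower(): h for h in headers}
    missing, case_mismatches = [], []
    for field in required:
        if field in headers:
            continue
        if field.lower() in lowered:
            # Present, but with different capitalisation.
            case_mismatches.append((field, lowered[field.lower()]))
        else:
            missing.append(field)
    return missing, case_mismatches

sample = """quarter,Revenue,growth_rate
Q1-2023,1250000,0.12
"""
missing, mismatched = check_headers(sample, REQUIRED_FIELDS)
print("missing:", missing)
print("case mismatches:", mismatched)
```

Reporting case mismatches separately from missing columns makes the most common cause of a failed match easy to spot.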

2. Template Selection

Built-in templates are stored in ~/.storyteller/templates/ or /usr/local/share/storyteller/templates/. Custom templates are Jinja2 files whose {{ variable }} placeholders correspond to data columns.

Template structure:

# {{ title }}
## {{ subtitle }}

**Period**: {{ quarter }}

Revenue performance for {{ region }} reached ${{ revenue }} in {{ quarter }}, representing a {{ growth_rate|percent }} increase over the previous period.
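
To show how placeholders map to data columns, here is a stdlib-only stand-in for the rendering step. It is not Jinja2 and not the engine's renderer: it handles only {{ name }} and {{ name|filter }}, and the percent function is an assumption about what the |percent filter produces (0.12 becomes 12%).

```python
import re

def percent(value):
    # Assumed behaviour of the |percent filter: 0.12 -> "12%".
    return f"{float(value) * 100:g}%"

FILTERS = {"percent": percent}

# Matches "{{ name }}" and "{{ name|filter }}" only; real Jinja2 supports far more.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*(?:\|\s*(\w+)\s*)?\}\}")

def render(template, row):
    def substitute(match):
        name, filter_name = match.group(1), match.group(2)
        value = row[name]  # a KeyError here corresponds to a missing column
        if filter_name:
            value = FILTERS[filter_name](value)
        return str(value)
    return PLACEHOLDER.sub(substitute, template)

template = (
    "Revenue for {{ region }} reached ${{ revenue }} in {{ quarter }}, "
    "a {{ growth_rate|percent }} increase."
)
row = {
    "quarter": "Q1-2023",
    "revenue": "1250000",
    "growth_rate": "0.12",
    "region": "North America",
}
print(render(template, row))
```

Each placeholder name must be a key in the row, which is why a column header mismatch surfaces as a template error.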

3. Execution

The engine:

Loads and validates input data using Pydantic schemas
Renders the Jinja2 template with data rows (aggregates if multiple rows)
Constructs an OpenAI prompt with template output + context + tone instructions
Calls the OpenAI API with configured model and temperature
Post-processes response (truncation, formatting)
Returns final narrative
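
The sequence above can be sketched end to end. This is an illustration under stated assumptions, not the engine's code: every function name is hypothetical, validation is a plain required-field check rather than Pydantic, rendering is a one-line stand-in for the Jinja2 template, and fake_model replaces the API call so the sketch runs offline.

```python
import csv
import io

def load_rows(csv_text):
    return list(csv.DictReader(io.StringIO(csv_text)))

def validate_rows(rows, required):
    # Stand-in for schema validation: every required field must be non-empty.
    for i, row in enumerate(rows, start=1):
        missing = [f for f in required if not row.get(f)]
        if missing:
            raise ValueError(f"row {i} is missing {missing}")

def render_rows(rows):
    # Stand-in for template rendering: one line per data row.
    return "\n".join(
        f"{r['region']}: revenue {r['revenue']} in {r['quarter']}" for r in rows
    )

def build_prompt(rendered, context, tone):
    return f"Tone: {tone}\nContext: {context}\n\nData summary:\n{rendered}"

def fake_model(prompt, temperature):
    # Placeholder for the API call; a real client would send `prompt` to the model.
    return "Draft narrative based on: " + prompt.splitlines()[-1]

def truncate_words(text, max_words):
    # Post-processing: enforce the word limit.
    return " ".join(text.split()[:max_words])

def generate(csv_text, context, tone="professional", temperature=0.7, max_words=1000):
    rows = load_rows(csv_text)
    validate_rows(rows, ["quarter", "revenue", "region"])
    prompt = build_prompt(render_rows(rows), context, tone)
    return truncate_words(fake_model(prompt, temperature), max_words)

data = "quarter,revenue,growth_rate,region\nQ1-2023,1250000,0.12,North America\n"
print(generate(data, context="Board meeting", max_words=8))
```

Keeping validation ahead of prompt construction means malformed data fails before any API cost is incurred.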

4. Output Formats

Markdown: Clean, formatted text (default)
HTML: Includes basic styling and can be converted to PDF via pandoc
PDF: Requires pandoc and wkhtmltopdf installed
Text: Plain text, stripped of formatting

Golden Rules

Schema Alignment: Column headers in the input data must match the template's field names exactly, including case. Run storyteller validate before generation.
Token Budgeting: Large datasets can exceed the model's context window. Filter the data beforehand or use templates with aggregation filters like {{ revenue|sum }}.
Context Injection: The --context flag supplies background the model cannot infer from the data. Without it, narratives tend to be generic. Example: --context "Target audience: C-level executives. Focus on ROI and strategic implications."
Temperature Discipline:
- 0.0-0.3: Consistent, repetitive (good for standardized reports)
- 0.4-0.7: Balanced creativity (recommended)
- 0.8-1.2: Varied, engaging (use with context)
- >1.2: Unpredictable, may hallucinate (avoid for production)
Template Testing: Always inspect templates with storyteller templates show NAME and run storyteller validate before full generation.
Rate Limiting: The engine respects OpenAI rate limits automatically; for large files, add --batch-size N to process rows in chunks.
API Key Management: Never pass OPENAI_API_KEY on the command line. Use the environment variable or storyteller config set openai_key.
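
For the Token Budgeting rule, one way to shrink the input before generation is to aggregate rows yourself. The sketch below is a standalone pre-processing step, not part of the engine; aggregate_revenue is a hypothetical helper, and the rows are the sales_data.csv sample from Data Preparation.

```python
import csv
import io
from collections import defaultdict

def aggregate_revenue(csv_text, group_by="region"):
    """Collapse many rows into one revenue total per group."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row[group_by]] += float(row["revenue"])
    return dict(totals)

raw = """quarter,revenue,growth_rate,region
Q1-2023,1250000,0.12,North America
Q2-2023,1380000,0.104,North America
"""

totals = aggregate_revenue(raw)
for region, revenue in totals.items():
    print(f"{region}: {revenue:.0f}")
```

The prompt then carries one line per region instead of one line per row, which keeps its size roughly constant as the dataset grows.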

Examples

Example 1: Executive Sales Summary

STORYTELLER_MODEL=gpt-4-turbo storyteller generate \
  --input q4_sales.csv \
  --template executive_summary \
  --output exec_summary.md \
  --tone professional \
  --role executive \
  --context "Year-end report for board meeting. Emphasize YoY growth and regional highlights. Include forward-looking statement." \
  --max-length 800

Input (q4_sales.csv):

quarter,revenue,growth_rate,region,product_category
Q4-2023,4500000,0.18,North America,Enterprise
Q4-2023,3200000,0.12,Europe,SMB

Output (exec_summary.md):

# Q4 2023 Executive Summary

North America led revenue performance with $4.5M in Q4, achieving 18% growth driven by Enterprise segment expansion. Europe showed steady 12% growth totaling $3.2M, with SMB category outperforming expectations. Overall quarterly revenue surpassed targets by 7%, positioning the company for strong Q1 2024. Recommended focus: capitalize on North American Enterprise momentum and replicate success in European markets.

Example 2: Customer Survey Narrative

storyteller generate \
  --input survey_responses.json \
  --template customer_persona \
  --format markdown \
  --tone narrative \
  --context "Create 3 distinct customer personas based on satisfaction scores and feedback. Use vivid language." \
  --max-length 1500

Input (survey_responses.json):

[
  {"segment": "Enterprise", "nps": 9, "feedback": "Reliable, scales well", "usage_months": 24},
  {"segment": "Startup", "nps": 7, "feedback": "Affordable but limited features", "usage_months": 3}
]

Output (excerpt):

## Persona: The Scaling Enterprise

"The platform has become integral to our infrastructure; it simply works at scale." This Enterprise customer, with two years of usage, gave an NPS rating of 9, which is in the promoter range. Their feedback highlights reliability as the key driver...

Example 3: Operations Log to Daily Report

storyteller generate \
  --input incidents_2024-03-07.csv \
  --template ops_daily \
  --output daily_ops.html \
  --format html \
  --tone technical \
  --role technical \
  --context "Audience: engineering leadership. Include severity breakdown and MTTR." \
  --max-length 1200

Input (incidents_2024-03-07.csv):

timestamp,severity,service,resolution_time_minutes,description
2024-03-07 02:15:00,critical,auth-service,45,TLS cert expired
2024-03-07 08:30:00,high,api-gateway,12,Rate limiter misconfiguration

Output (daily_ops.html):

<h1>Daily Operations Report: March 7, 2024</h1>
<p><strong>Total Incidents:</strong> 2 (1 critical, 1 high)</p>
<p>The day began with a critical outage in the authentication service at 02:15 UTC caused by an expired TLS certificate. The issue was resolved within 45 minutes. A secondary high-severity incident in the API gateway at 08:30 UTC stemmed from a rate limiter misconfiguration and was contained within 12 minutes. <strong>Mean Time to Resolution (MTTR):</strong> 28.5 minutes.</p>
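
The severity breakdown and MTTR in this report can be checked directly from the input rows. The sketch below is a standalone verification in Python, independent of the engine; the MTTR is the arithmetic mean of resolution_time_minutes, (45 + 12) / 2 = 28.5.

```python
import csv
import io
from collections import Counter

incidents_csv = """timestamp,severity,service,resolution_time_minutes,description
2024-03-07 02:15:00,critical,auth-service,45,TLS cert expired
2024-03-07 08:30:00,high,api-gateway,12,Rate limiter misconfiguration
"""

rows = list(csv.DictReader(io.StringIO(incidents_csv)))

# Count incidents per severity level.
severity_counts = Counter(row["severity"] for row in rows)

# Mean time to resolution: total minutes divided by number of incidents.
mttr = sum(float(row["resolution_time_minutes"]) for row in rows) / len(rows)

print(dict(severity_counts))
print(f"MTTR: {mttr} minutes")
```

Recomputing figures like these from the source data is a cheap guard against numbers the model may have altered in the narrative.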

Rollback Commands

If generated output is unsatisfactory or corrupted:

storyteller rollback last

Reverts the most recent generate operation by restoring the previous output file from backup (.storyteller-backup-<timestamp>.md).

storyteller rollback --output PATH

Rolls back only the output at PATH, provided it was generated by storyteller (checks the embedded metadata footer).

storyteller rollback --all --date YYYY-MM-DD

Mass rollback: removes all outputs generated on the specified date and restores from backups if available.

Note: Rollback works only if --output was used (not stdout). Backups are automatically created during generation unless --no-backup is passed.
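
The backup-then-restore idea behind rollback can be sketched as follows. This is an assumption about the general mechanism, not the engine's implementation: write_with_backup and rollback are hypothetical helpers, and the timestamp format in the backup name is made up for illustration.

```python
import shutil
import tempfile
import time
from pathlib import Path

def write_with_backup(path, new_text):
    """Write new_text to path, copying any existing file to a backup first."""
    path = Path(path)
    backup = None
    if path.exists():
        stamp = time.strftime("%Y%m%dT%H%M%S")  # assumed timestamp format
        backup = path.with_name(f".storyteller-backup-{stamp}.md")
        shutil.copy2(path, backup)
    path.write_text(new_text, encoding="utf-8")
    return backup

def rollback(path, backup):
    """Restore path from its backup; fail loudly if there is none."""
    if backup is None or not Path(backup).exists():
        raise FileNotFoundError("no backup available; cannot roll back")
    shutil.copy2(backup, path)

# Demonstration in a throwaway directory.
workdir = Path(tempfile.mkdtemp())
output = workdir / "exec_summary.md"
output.write_text("first draft", encoding="utf-8")

backup = write_with_backup(output, "second draft")
rollback(output, backup)
restored = output.read_text(encoding="utf-8")
print(restored)
```

The sketch also shows why rollback cannot work for stdout or with backups disabled: without a file on disk to copy, there is nothing to restore from.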

Dependencies & Requirements

Python: 3.9+ with pip
OpenAI API: Valid API key with GPT-4 or GPT-3.5-turbo access
Data Files: CSV must be UTF-8 encoded; JSON must be array of objects; Excel requires openpyxl
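Templates: Jinja2 syntax; custom filters available: |percent, |currency, |titlecase, and |aggregate with sum, avg, or count (Jinja2 passes filter arguments in parentheses, so the call form is likely |aggregate('sum'))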
Disk Space: ~100MB for installation; templates typically <50KB each
Network: Outbound HTTPS to OpenAI API (status.openai.com check on startup)

Installation:

# Quotes keep shells such as zsh from treating the brackets as a glob pattern
pip install "storyteller-engine[full]"
# for CSV/Excel support:
pip install "storyteller-engine[data]"

Configuration:

storyteller config set openai_key sk-...
storyteller config set default_model gpt-4-turbo
storyteller config set template_dir ~/.mytemplates

Verification Steps

After generation, verify with:

# Check output file is not empty and has expected keywords
grep -q "revenue" exec_summary.md && echo "PASS" || echo "FAIL"

# Validate against template field coverage
storyteller validate --input q4_sales.csv --template executive_summary

# Token usage estimation (dry-run)
storyteller generate --input data.csv --template temp --dry-run
# Shows prompt tokens, max completion tokens, estimated cost

# Schema check only
storyteller validate --input data.csv --strict

Expected exit codes:

0: Success (narrative generated and passed quality checks)
1: Validation failed (data doesn't match template)
2: API error (OpenAI unreachable or quota exceeded)
3: Template error (syntax error or missing required variable)
4: Output already exists (use --force to overwrite)
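
In automation, these exit codes can be mapped to messages by a small wrapper. The sketch below uses Python's subprocess module; describe_exit and run_and_describe are hypothetical helper names, and the descriptions are taken from the list above.

```python
import subprocess

EXIT_CODES = {
    0: "success",
    1: "validation failed",
    2: "API error",
    3: "template error",
    4: "output already exists",
}

def describe_exit(code):
    return EXIT_CODES.get(code, f"unexpected exit code {code}")

def run_and_describe(argv):
    """Run a command and return (exit code, description)."""
    result = subprocess.run(argv, capture_output=True, text=True)
    return result.returncode, describe_exit(result.returncode)

print(describe_exit(4))
```

A pipeline could call run_and_describe(["storyteller", "generate", "--input", "q4_sales.csv", "--template", "executive_summary"]) and, for example, retry only on code 2 while failing fast on codes 1 and 3.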

Troubleshooting

Error: "Template variable 'revenue' not found in data"

Cause: Column header mismatch between CSV and template.
Fix: Ensure the CSV column names exactly match the template placeholders (here, {{ revenue }}), including case. Use storyteller templates show NAME to see the required fields.

Error: "Context length exceeded"

Cause: Input data too large for model's context window.
Fix: Filter the data to essential rows, or aggregate in the template ({{ revenue|sum }} instead of iterating over rows). Lowering --max-length shrinks only the requested output, so it helps only when the prompt itself already fits.

Output Generic / Hallucinated

Cause: --context too vague or missing; temperature too high.
Fix: Provide specific context with --context "Focus on Q4 2023, exclude Q3 comparisons". Lower --temperature to 0.4-0.6.

Rate Limit Errors

Cause: Too many rapid requests.
Fix: Add --batch-size 5 to process 5 rows at a time. Or implement exponential backoff with storyteller config set retry_policy exponential.
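
If you call the engine from your own script, exponential backoff can also be applied on the caller's side. The sketch below shows the general technique only; RateLimitError is a hypothetical stand-in for whatever exception your client raises, and flaky_request simulates two rate-limited attempts followed by a success.

```python
import random
import time

class RateLimitError(Exception):
    """Hypothetical stand-in for a client's rate-limit exception."""

def call_with_backoff(func, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    for attempt in range(max_attempts):
        try:
            return func()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the error
            # Delay doubles on each attempt; small jitter spreads out retries.
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
            sleep(delay)

attempts = {"count": 0}

def flaky_request():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RateLimitError("slow down")
    return "ok"

# Record the delays instead of sleeping so the demo runs instantly.
delays = []
result = call_with_backoff(flaky_request, sleep=delays.append)
print(result, len(delays))
```

Passing the sleep function as a parameter keeps the retry logic testable without real waiting.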

No Output File Created

Cause: Output path directory doesn't exist.
Fix: Create the directory first: mkdir -p "$(dirname output.md)". Or use an absolute path to rule out a wrong working directory.

Rollback Not Working

Cause: Backup file missing (was --no-backup used or file manually deleted?).
Fix: Rollbacks require backup files. Regular cleanup removes backups older than 30 days. Restore from version control if available.

Validation Fails on Optional Field

Cause: Template uses {{ optional_field }} but data missing column.
Fix: Either add the column to the data or modify the template to use {% if optional_field %}...{% endif %}. Omit --strict unless optional fields must also be present, since --strict fails on missing optional fields.
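
The difference between the default check and --strict, as described under storyteller validate, can be sketched like this. It is a simplified model rather than the engine's validator; validate_fields is a hypothetical function, and treating product_category as optional is an assumption made only for the example.

```python
def validate_fields(columns, required, optional, strict=False):
    """Return a list of problems; an empty list means the data passes."""
    present = set(columns)
    problems = [f"missing required field: {f}" for f in required if f not in present]
    if strict:
        # Strict mode also reports optional fields that are absent.
        problems += [f"missing optional field: {f}" for f in optional if f not in present]
    return problems

columns = ["quarter", "revenue", "growth_rate", "region"]
required = ["quarter", "revenue", "region"]
optional = ["product_category"]  # treated as optional for illustration only

print(validate_fields(columns, required, optional))
print(validate_fields(columns, required, optional, strict=True))
```

The same data passes the default check and fails the strict one, which matches the symptom described in this troubleshooting entry.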