data-visualizer
SKILL.md
Data Visualizer
Overview
Automated visualization generation for exploratory data analysis, model performance reporting, and stakeholder communication. Creates publication-quality plots, interactive dashboards, and business-friendly reports—all integrated with SpecWeave's increment workflow.
Visualization Categories
1. Exploratory Data Analysis (EDA)
Automated EDA Report:
from specweave import EDAVisualizer
visualizer = EDAVisualizer(increment="0042")
# Generates comprehensive EDA report
report = visualizer.generate_eda_report(df)
# Creates:
# - Dataset overview (rows, columns, memory, missing values)
# - Numerical feature distributions (histograms + KDE)
# - Categorical feature counts (bar charts)
# - Correlation heatmap
# - Missing value pattern
# - Outlier detection plots
# - Feature relationships (pairplot for top features)
Individual EDA Plots:
# Distribution plots
visualizer.plot_distribution(
data=df['age'],
title="Age Distribution",
bins=30
)
# Correlation heatmap
visualizer.plot_correlation_heatmap(
data=df[numerical_columns],
method='pearson' # or 'spearman', 'kendall'
)
# Missing value patterns
visualizer.plot_missing_values(df)
# Outlier detection (boxplots)
visualizer.plot_outliers(df[numerical_columns])
2. Model Performance Visualizations
Classification Performance:
from specweave import ClassificationVisualizer
viz = ClassificationVisualizer(increment="0042")
# Confusion matrix
viz.plot_confusion_matrix(
y_true=y_test,
y_pred=y_pred,
classes=['Negative', 'Positive']
)
# ROC curve
viz.plot_roc_curve(
y_true=y_test,
y_proba=y_proba
)
# Precision-Recall curve
viz.plot_precision_recall_curve(
y_true=y_test,
y_proba=y_proba
)
# Learning curves (train vs val)
viz.plot_learning_curve(
train_scores=train_scores,
val_scores=val_scores
)
# Calibration curve (are probabilities well-calibrated?)
viz.plot_calibration_curve(
y_true=y_test,
y_proba=y_proba
)
Regression Performance:
from specweave import RegressionVisualizer
viz = RegressionVisualizer(increment="0042")
# Predicted vs Actual
viz.plot_predictions(
y_true=y_test,
y_pred=y_pred
)
# Residual plot
viz.plot_residuals(
y_true=y_test,
y_pred=y_pred
)
# Residual distribution (should be normal)
viz.plot_residual_distribution(
residuals=y_test - y_pred
)
# Error by feature value
viz.plot_error_analysis(
y_true=y_test,
y_pred=y_pred,
features=X_test
)
3. Feature Analysis Visualizations
Feature Importance:
from specweave import FeatureVisualizer
viz = FeatureVisualizer(increment="0042")
# Feature importance (bar chart)
viz.plot_feature_importance(
feature_names=feature_names,
importances=model.feature_importances_,
top_n=20
)
# SHAP summary plot
viz.plot_shap_summary(
shap_values=shap_values,
features=X_test
)
# Partial dependence plots
viz.plot_partial_dependence(
model=model,
features=['age', 'income'],
X=X_train
)
# Feature interaction
viz.plot_feature_interaction(
model=model,
features=('age', 'income'),
X=X_train
)
4. Time Series Visualizations
Time Series Plots:
from specweave import TimeSeriesVisualizer
viz = TimeSeriesVisualizer(increment="0042")
# Time series with trend
viz.plot_timeseries(
data=sales_data,
show_trend=True
)
# Seasonal decomposition
viz.plot_seasonal_decomposition(
data=sales_data,
period=12 # Monthly seasonality
)
# Autocorrelation (ACF, PACF)
viz.plot_autocorrelation(data=sales_data)
# Forecast with confidence intervals
viz.plot_forecast(
actual=test_data,
forecast=forecast,
confidence_intervals=(0.80, 0.95)
)
5. Model Comparison Visualizations
Compare Multiple Models:
from specweave import ModelComparisonVisualizer
viz = ModelComparisonVisualizer(increment="0042")
# Compare metrics across models
viz.plot_model_comparison(
models=['Baseline', 'XGBoost', 'LightGBM', 'Neural Net'],
metrics={
'accuracy': [0.65, 0.87, 0.86, 0.85],
'roc_auc': [0.70, 0.92, 0.91, 0.90],
'training_time': [1, 45, 32, 320]
}
)
# ROC curves for multiple models
viz.plot_roc_curves_comparison(
models_predictions={
'XGBoost': (y_test, y_proba_xgb),
'LightGBM': (y_test, y_proba_lgbm),
'Neural Net': (y_test, y_proba_nn)
}
)
Interactive Visualizations
Plotly Integration:
from specweave import InteractiveVisualizer
viz = InteractiveVisualizer(increment="0042")
# Interactive scatter plot (zoom, pan, hover)
viz.plot_interactive_scatter(
x=X_test[:, 0],
y=X_test[:, 1],
colors=y_pred,
hover_data=df[['id', 'amount', 'merchant']]
)
# Interactive confusion matrix (click for details)
viz.plot_interactive_confusion_matrix(
y_true=y_test,
y_pred=y_pred
)
# Interactive feature importance (sortable, filterable)
viz.plot_interactive_feature_importance(
feature_names=feature_names,
importances=importances
)
Business Reporting
Automated ML Report:
from specweave import MLReportGenerator
generator = MLReportGenerator(increment="0042")
# Generate executive summary report
report = generator.generate_report(
model=model,
test_data=(X_test, y_test),
business_metrics={
'false_positive_cost': 5,
'false_negative_cost': 500
}
)
# Creates:
# - Executive summary (1 page, non-technical)
# - Key metrics (accuracy, precision, recall)
# - Business impact ($$ saved, ROI)
# - Model performance visualizations
# - Recommendations
# - Technical appendix
Report Output (HTML/PDF):
# Fraud Detection Model - Executive Summary
## Key Results
- **Accuracy**: 87% (target: >85%) ✅
- **Fraud Detection Rate**: 62% (catching 310 frauds/day)
- **False Positive Rate**: 38% (190 false alarms/day)
## Business Impact
- **Fraud Prevented**: $155,000/day
- **Review Cost**: $950/day (190 transactions × $5)
- **Net Benefit**: $154,050/day ✅
- **Annual Savings**: $56.2M
## Model Performance
[Confusion Matrix Visualization]
[ROC Curve]
[Feature Importance]
## Recommendations
1. ✅ Deploy to production immediately
2. Monitor fraud patterns weekly
3. Retrain model monthly with new data
Dashboard Creation
Real-Time Dashboard:
from specweave import DashboardCreator
creator = DashboardCreator(increment="0042")
# Create Grafana/Plotly dashboard
dashboard = creator.create_dashboard(
title="Model Performance Dashboard",
panels=[
{'type': 'metric', 'query': 'prediction_latency_p95'},
{'type': 'metric', 'query': 'predictions_per_second'},
{'type': 'timeseries', 'query': 'accuracy_over_time'},
{'type': 'timeseries', 'query': 'error_rate'},
{'type': 'heatmap', 'query': 'prediction_distribution'},
{'type': 'table', 'query': 'recent_anomalies'}
]
)
# Exports to Grafana JSON or Plotly Dash app
dashboard.export(format='grafana')
Visualization Best Practices
1. Publication-Quality Plots
# Set consistent styling
visualizer.set_style(
style='seaborn', # Or 'ggplot', 'fivethirtyeight'
context='paper', # Or 'notebook', 'talk', 'poster'
palette='colorblind' # Accessible colors
)
# High-resolution exports
visualizer.save_figure(
filename='model_performance.png',
dpi=300, # Publication quality
bbox_inches='tight'
)
2. Accessible Visualizations
# Colorblind-friendly palettes
visualizer.use_colorblind_palette()
# Add alt text for accessibility
visualizer.add_alt_text(
plot=fig,
description="Confusion matrix showing 87% accuracy"
)
# High contrast for presentations
visualizer.set_high_contrast_mode()
3. Annotation and Context
# Add reference lines
viz.add_reference_line(
y=0.85, # Target accuracy
label='Target',
color='red',
linestyle='--'
)
# Add annotations
viz.annotate_point(
x=optimal_threshold,
y=optimal_f1,
text='Optimal threshold: 0.47'
)
Integration with SpecWeave
Automated Visualization in Increments
# All visualizations auto-saved to increment folder
visualizer = EDAVisualizer(increment="0042")
# Creates:
# .specweave/increments/0042-fraud-detection/
# ├── visualizations/
# │ ├── eda/
# │ │ ├── distributions.png
# │ │ ├── correlation_heatmap.png
# │ │ └── missing_values.png
# │ ├── model_performance/
# │ │ ├── confusion_matrix.png
# │ │ ├── roc_curve.png
# │ │ ├── precision_recall.png
# │ │ └── learning_curves.png
# │ ├── feature_analysis/
# │ │ ├── feature_importance.png
# │ │ ├── shap_summary.png
# │ │ └── partial_dependence/
# │ └── reports/
# │ ├── executive_summary.html
# │ └── technical_report.pdf
Living Docs Integration
/sw:sync-docs update
Updates:
<!-- .specweave/docs/internal/architecture/ml-model-performance.md -->
## Fraud Detection Model Performance (Increment 0042)
### Model Accuracy

### Key Metrics
- Accuracy: 87%
- Precision: 85%
- Recall: 62%
- ROC AUC: 0.92
### Feature Importance

Top 5 features:
1. amount_vs_user_average (0.18)
2. days_since_last_purchase (0.12)
3. merchant_risk_score (0.10)
4. velocity_24h (0.08)
5. location_distance_from_home (0.07)
Commands
# Generate EDA report
/ml:visualize-eda 0042
# Generate model performance report
/ml:visualize-performance 0042
# Create interactive dashboard
/ml:create-dashboard 0042
# Export all visualizations
/ml:export-visualizations 0042 --format png,pdf,html
Advanced Features
1. Automated Report Generation
# Generate full increment report with all visualizations
generator = IncrementReportGenerator(increment="0042")
report = generator.generate_full_report()
# Includes:
# - EDA visualizations
# - Experiment comparisons
# - Best model performance
# - Feature importance
# - Business impact
# - Deployment readiness
2. Custom Visualization Templates
# Create reusable templates
template = VisualizationTemplate(name="fraud_analysis")
template.add_panel("confusion_matrix")
template.add_panel("roc_curve")
template.add_panel("top_fraud_features")
template.add_panel("fraud_trends_over_time")
# Apply to any increment
template.apply(increment="0042")
3. Version Control for Visualizations
# Track visualization changes across model versions
viz_tracker = VisualizationTracker(increment="0042")
# Compare model v1 vs v2 visualizations
viz_tracker.compare_versions(
version_1="model-v1",
version_2="model-v2"
)
# Shows: Confusion matrix improved, ROC curve comparison, etc.
Summary
Data visualization is critical for:
- ✅ Exploratory data analysis (understand data before modeling)
- ✅ Model performance communication (stakeholder buy-in)
- ✅ Feature analysis (understand what drives predictions)
- ✅ Business reporting (translate metrics to impact)
- ✅ Model debugging (identify issues visually)
This skill automates visualization generation, ensuring all ML work is visual, accessible, and business-friendly within SpecWeave's increment workflow.
Weekly Installs
8
Repository
anton-abyzov/specweaveInstalled on
claude-code7
opencode5
cursor5
codex5
antigravity5
gemini-cli5