data-visualization
Data Visualization
This skill enables an AI agent to transform structured data into meaningful visual representations. The agent selects appropriate chart types based on the data and the question being asked, builds publication-quality static charts with matplotlib and seaborn, and creates interactive visualizations with plotly. It follows established data visualization principles to ensure clarity, accuracy, and visual appeal.
Workflow
- Understand the data and the question. Examine the dataset's structure: how many variables, what types (numeric, categorical, temporal), and what relationship or comparison the user wants to highlight. The question drives chart selection more than the data alone.
- Select the appropriate chart type. Match the analytical goal to the right visual form. Use bar charts for categorical comparisons, line charts for trends over time, scatter plots for relationships between two continuous variables, histograms for distributions, box plots for spread and outliers, and heatmaps for correlation matrices or dense categorical grids.
- Prepare the data for plotting. Aggregate, pivot, or reshape the data as needed. Sort categorical axes by value for bar charts. Resample time series to the right granularity. Handle NaN values before plotting so they do not cause unintended gaps or errors.
- Build the visualization with appropriate styling. Apply consistent color palettes, readable axis labels, descriptive titles, and proper legends. Remove chart junk such as unnecessary gridlines, borders, and decorations. Use figure sizes that match the intended output medium (report, slide, dashboard).
- Add context and annotations. Highlight key data points with annotations, reference lines, or shaded regions. Add summary statistics directly on the chart where helpful (e.g., median line on a box plot, trend line on a scatter). Context turns a chart from decoration into analysis.
- Export or display. Save static charts as PNG or SVG for reports, or render interactive HTML for dashboards and exploration. For PNG output, set DPI to 150 or higher; SVG is vector-based and scales without loss.
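The preparation step can be sketched as follows. This is a minimal illustration under stated assumptions, not a fixed recipe: the column names `date`, `region`, and `revenue` mirror the examples later in this document, and the helper names and sample rows are hypothetical.

```python
import pandas as pd


def revenue_by_region(df: pd.DataFrame) -> pd.Series:
    """Aggregate revenue per region, sorted by value for a bar chart."""
    # Drop rows that cannot be assigned to a region or have no revenue.
    clean = df.dropna(subset=["region", "revenue"])
    return clean.groupby("region")["revenue"].sum().sort_values()


def monthly_revenue(df: pd.DataFrame) -> pd.Series:
    """Resample revenue to monthly totals for a line chart."""
    # "MS" labels each bucket by its month start.
    # min_count=1 keeps months with no valid data as NaN, so the gap can be
    # handled deliberately instead of turning into a misleading zero.
    return df.resample("MS", on="date")["revenue"].sum(min_count=1)


# Tiny hypothetical dataset with one missing region and one missing revenue.
sample = pd.DataFrame({
    "date": pd.to_datetime(["2024-10-03", "2024-10-20", "2024-11-05", "2024-12-11"]),
    "region": ["East", "West", "East", None],
    "revenue": [100.0, 250.0, None, 80.0],
})

by_region = revenue_by_region(sample)
by_month = monthly_revenue(sample)
```

Here November has no valid revenue, so `by_month` holds NaN for that month rather than 0. Whether to interpolate, drop, or show that gap is a decision for the plotting step.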
Supported Technologies
- matplotlib — foundational plotting library for full control over every visual element
- seaborn — statistical visualization with sensible defaults and built-in themes
- plotly — interactive charts with hover tooltips, zoom, and pan
- plotly.express — concise API for rapid interactive chart creation
When to Use Which Chart Type
| Goal | Chart Type | Library |
|---|---|---|
| Compare categories | Bar chart (vertical or horizontal) | matplotlib, seaborn |
| Show trend over time | Line chart | matplotlib, plotly |
| Explore relationship between 2 variables | Scatter plot | seaborn, plotly |
| Show distribution of a variable | Histogram or KDE | seaborn |
| Compare distributions across groups | Box plot or violin plot | seaborn |
| Display correlation matrix | Heatmap | seaborn |
| Show composition / proportions | Stacked bar or pie chart | matplotlib |
| Enable user exploration | Interactive chart | plotly |
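One way an agent might encode the table above is a plain lookup from analytical goal to chart type and libraries. The goal keys and the `suggest_chart` name below are hypothetical labels chosen for this sketch; only the chart types and libraries come from the table.

```python
from typing import Dict, List, Tuple

# Goal key -> (chart type, suggested libraries), transcribed from the table.
CHART_GUIDE: Dict[str, Tuple[str, List[str]]] = {
    "compare_categories": ("bar chart", ["matplotlib", "seaborn"]),
    "trend_over_time": ("line chart", ["matplotlib", "plotly"]),
    "relationship": ("scatter plot", ["seaborn", "plotly"]),
    "distribution": ("histogram or KDE", ["seaborn"]),
    "compare_distributions": ("box plot or violin plot", ["seaborn"]),
    "correlation_matrix": ("heatmap", ["seaborn"]),
    "composition": ("stacked bar or pie chart", ["matplotlib"]),
    "exploration": ("interactive chart", ["plotly"]),
}


def suggest_chart(goal: str) -> Tuple[str, List[str]]:
    """Return the (chart type, libraries) pair for a known goal key."""
    if goal not in CHART_GUIDE:
        known = ", ".join(sorted(CHART_GUIDE))
        raise ValueError(f"Unknown goal {goal!r}; expected one of: {known}")
    return CHART_GUIDE[goal]


chart, libraries = suggest_chart("trend_over_time")
```

A lookup like this only covers the default case. When the user names a chart type explicitly, that request would normally take precedence.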
Usage
Provide the agent with a dataset and a description of what you want to visualize. Optionally specify chart type, color preferences, output format, and figure dimensions. The agent will select the best approach if no chart type is specified.
Examples
Example 1: Sales dashboard with matplotlib and seaborn
```python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

df = pd.read_csv("quarterly_sales.csv", parse_dates=["date"])

sns.set_theme(style="whitegrid", palette="viridis")
fig, axes = plt.subplots(2, 2, figsize=(14, 10))
fig.suptitle("Q4 2024 Sales Dashboard", fontsize=16, fontweight="bold")

# 1. Monthly revenue trend
# "M" means month-end; pandas 2.2+ deprecates it in favor of "ME".
monthly = df.resample("M", on="date")["revenue"].sum()
axes[0, 0].plot(monthly.index, monthly.values, marker="o", linewidth=2)
axes[0, 0].set_title("Monthly Revenue Trend")
axes[0, 0].set_ylabel("Revenue ($)")
axes[0, 0].tick_params(axis="x", rotation=45)

# 2. Revenue by region (horizontal bar, sorted by value)
region = df.groupby("region")["revenue"].sum().sort_values()
axes[0, 1].barh(region.index, region.values, color=sns.color_palette("viridis", len(region)))
axes[0, 1].set_title("Revenue by Region")
axes[0, 1].set_xlabel("Total Revenue ($)")

# 3. Units sold distribution (histogram with a median reference line)
axes[1, 0].hist(df["units_sold"], bins=30, edgecolor="white", alpha=0.8)
axes[1, 0].axvline(df["units_sold"].median(), color="red", linestyle="--", label="Median")
axes[1, 0].set_title("Units Sold Distribution")
axes[1, 0].legend()

# 4. Revenue vs. discount scatter with regression line
sns.regplot(data=df, x="discount", y="revenue", ax=axes[1, 1],
            scatter_kws={"alpha": 0.4, "s": 15}, line_kws={"color": "red"})
axes[1, 1].set_title("Revenue vs. Discount")

plt.tight_layout()
plt.savefig("sales_dashboard.png", dpi=150, bbox_inches="tight")
plt.show()
```
Example 2: Interactive visualization with plotly
```python
import pandas as pd
import plotly.express as px

df = pd.read_csv("global_sales.csv")

# Interactive scatter with size, color, and hover data
fig = px.scatter(
    df,
    x="marketing_spend",
    y="revenue",
    size="units_sold",
    color="region",
    hover_data=["product_name", "quarter"],
    title="Marketing Spend vs Revenue by Region",
    labels={
        "marketing_spend": "Marketing Spend ($)",
        "revenue": "Revenue ($)",
        "units_sold": "Units Sold",
    },
    template="plotly_white",
)
fig.update_traces(marker=dict(opacity=0.7, line=dict(width=1, color="DarkSlateGrey")))

# Annotate a cluster of interest at fixed data coordinates
fig.add_annotation(
    x=45000, y=320000,
    text="Strong ROI cluster:<br>low spend, high revenue",
    showarrow=True, arrowhead=2,
    font=dict(size=12, color="darkblue"),
)

fig.write_html("interactive_scatter.html")
fig.show()

# Users can hover over points to see product_name and quarter,
# zoom into clusters, and toggle regions on/off via the legend.
```
Best Practices
- Choose chart type based on the analytical question, not aesthetics — a scatter plot that reveals no pattern is still the right choice if the question is about correlation.
- Limit color categories to 7 or fewer; beyond that, use faceting or small multiples instead of cramming more colors into a single legend.
- Always label axes with units and use human-readable number formats (e.g., "$1.2M" not "1200000").
- Start bar chart y-axes at zero to avoid exaggerating differences; line charts may use a truncated axis when the focus is on change rather than absolute values.
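- Use colorblind-friendly palettes by default: viridis or cividis for sequential data, and a colorblind-safe qualitative set (a suitable ColorBrewer set or seaborn's "colorblind" palette) for categories.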
- Export at 150+ DPI for any chart that will appear in a document or presentation.
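The human-readable number format mentioned above ("$1.2M" rather than "1200000") could be produced by a small formatter like the one below. The function name `human_currency` and the K/M/B suffix thresholds are assumptions made for this sketch.

```python
def human_currency(value: float, pos: int = 0) -> str:
    """Format a dollar amount compactly, e.g. 1200000 -> "$1.2M".

    `pos` is unused; it is accepted so the function matches the
    (value, position) callback signature used by matplotlib tick formatters.
    """
    sign = "-" if value < 0 else ""
    magnitude = abs(value)
    # Check the largest unit first so 2_500_000_000 becomes "B", not "M".
    for threshold, suffix in ((1e9, "B"), (1e6, "M"), (1e3, "K")):
        if magnitude >= threshold:
            return f"{sign}${magnitude / threshold:.1f}{suffix}"
    return f"{sign}${magnitude:.0f}"


labels = [human_currency(v) for v in (950, 4500, 1_200_000, 2_500_000_000)]
```

With matplotlib, a function of this shape can be attached to an axis via `ax.yaxis.set_major_formatter(matplotlib.ticker.FuncFormatter(human_currency))`, so the tick labels are formatted without changing the underlying data.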
Edge Cases
- Too many categories for a single chart. If a bar chart would have more than 15 bars, show the top N and aggregate the rest into an "Other" category, or switch to a treemap.
- Overlapping points in scatter plots. Use transparency (`alpha=0.3`), jitter, or hexbin/2D density plots when thousands of points overlap.
- Long axis labels. Rotate labels 45 degrees, truncate with ellipsis, or switch to horizontal bar charts to keep text readable.
- Missing values creating gaps in line charts. Interpolate small gaps (1-2 points) linearly and mark them with a dashed segment. For larger gaps, break the line to avoid implying continuity.
- Extremely skewed data. Apply log-scale axes and note the transformation clearly in the axis label (e.g., "Revenue (log scale)").
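Two of the cases above lend themselves to small pandas helpers: collapsing a long tail of categories into "Other", and interpolating only short gaps in a series. The sketch below is illustrative. The function names, the default thresholds, and the sample values are assumptions, and it does not draw the dashed segment for interpolated points.

```python
import pandas as pd


def top_n_with_other(totals: pd.Series, n: int = 10, other_label: str = "Other") -> pd.Series:
    """Keep the n largest categories and sum the remainder into one bucket."""
    ordered = totals.sort_values(ascending=False)
    if len(ordered) <= n:
        return ordered
    rest = ordered.iloc[n:].sum()
    return pd.concat([ordered.iloc[:n], pd.Series({other_label: rest})])


def fill_small_gaps(series: pd.Series, max_gap: int = 2) -> pd.Series:
    """Linearly interpolate NaN runs of at most max_gap points; leave longer runs as NaN."""
    is_nan = series.isna()
    # Give every run of consecutive NaN (or non-NaN) values its own id.
    run_id = is_nan.astype(int).diff().ne(0).cumsum()
    run_len = run_id.map(run_id.value_counts())
    small_gap = is_nan & (run_len <= max_gap)
    interpolated = series.interpolate(method="linear", limit_area="inside")
    # Take interpolated values only inside small gaps.
    return series.where(~small_gap, interpolated)


# Hypothetical inputs.
categories = pd.Series({"a": 5, "b": 3, "c": 2, "d": 1})
collapsed = top_n_with_other(categories, n=2)

readings = pd.Series([1.0, None, 3.0, None, None, None, 7.0])
filled = fill_small_gaps(readings)
```

In this example the single missing reading is filled with 2.0, while the three-point gap stays NaN. matplotlib's `plot` leaves a break in the line where values are NaN, which gives the "break the line" behavior for larger gaps.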