Matplotlib - Data Visualization
Matplotlib is the most widely used Python library for 2D (and basic 3D) plotting. It provides full control over every element of a figure, from line styles to axis spines.
When to Use
- Creating publication-quality 2D plots (Line, Scatter, Bar, Hist)
- Visualizing scientific data (Heatmaps, Contours, Vector fields)
- Generating complex multi-panel figures
- Fine-tuning plots for papers/reports (LaTeX support)
- Building custom visualization tools and dashboards
- Plotting data directly from NumPy arrays or Pandas DataFrames
Reference Documentation
Official docs: https://matplotlib.org/stable/index.html
Gallery: https://matplotlib.org/stable/gallery/index.html (Essential for finding examples)
Search patterns: plt.subplots, ax.set_title, ax.legend, plt.savefig, matplotlib.colors
Core Principles
Two Interfaces: Choose Wisely
| Interface | Method | Use Case |
|---|---|---|
| Object-Oriented (OO) | fig, ax = plt.subplots() | Recommended. Best for complex, reproducible plots. |
| Pyplot (State-based) | plt.plot(x, y) | Quick interactive checks. Avoid for scripts/modules. |
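To make the difference concrete, here is a minimal sketch with made-up data. Pyplot calls act on whichever axes is "current", while OO calls name their target. The Agg backend line is an assumption so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt

fig, (ax_left, ax_right) = plt.subplots(1, 2)

# Pyplot (state-based): draws on the "current" axes, here the last one created.
plt.plot([0, 1, 2], [0, 1, 4])
current_is_right = plt.gca() is ax_right

# OO: the target axes is explicit, so nothing depends on hidden state.
ax_left.plot([0, 1, 2], [4, 1, 0])

lines_left = len(ax_left.lines)
lines_right = len(ax_right.lines)
plt.close(fig)
```

The pyplot call silently lands on the right-hand panel, which is easy to miss once a script grows beyond a single axes.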
Use Matplotlib For
- High-level control over figure layout.
- Precise styling for publication.
- Embedding plots in GUI applications.
Do NOT Use For
- Interactive web dashboards (use Plotly or Bokeh).
- Rapid statistical exploration (use Seaborn, which is built on Matplotlib but simpler for stats).
- Very large datasets (>1M points) in real-time (use Datashader or VisPy).
Quick Reference
Installation
pip install matplotlib
Standard Imports
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib import gridspec
Basic Pattern - The OO Interface (The "Proper" Way)
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0, 10, 100)
y = np.sin(x)
# 1. Create Figure and Axis objects
fig, ax = plt.subplots(figsize=(8, 5))
# 2. Plot data
ax.plot(x, y, label='Sine Wave', color='tab:blue', linewidth=2)
# 3. Customize
ax.set_xlabel('Time (s)')
ax.set_ylabel('Amplitude')
ax.set_title('Oscillation Example')
ax.legend()
ax.grid(True, linestyle='--')
# 4. Show or Save
plt.show()
# fig.savefig('plot.pdf', dpi=300, bbox_inches='tight')
Critical Rules
✅ DO
- Use the OO interface (ax.method()) - It prevents errors in multi-plot scripts.
- Use bbox_inches='tight' - When saving, to ensure labels aren't cut off.
- Set dpi - Use 300+ for print, 72-100 for web.
- Close figures - Use plt.close('all') in loops to avoid memory leaks.
- Label everything - Every axis must have a label and units.
- Vector formats - Save as .pdf or .svg for academic papers (lossless scaling).
- Colorblind-friendly - Use tab10 or viridis colormaps.
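Several of these rules can be combined in one batch-export loop. The following is a sketch only: the sine data, the file names, and the temporary output directory are placeholders chosen for illustration.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive batch export
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 200)
out_dir = tempfile.mkdtemp()  # hypothetical output location
saved = []

for freq in (1, 2, 3):
    fig, ax = plt.subplots(figsize=(6, 4))
    ax.plot(x, np.sin(freq * x), label=f"{freq} rad/s")
    ax.set_xlabel("Time (s)")          # label with units
    ax.set_ylabel("Amplitude (a.u.)")
    ax.legend()
    path = os.path.join(out_dir, f"sine_{freq}.pdf")  # vector format
    fig.savefig(path, dpi=300, bbox_inches="tight")
    saved.append(path)
    plt.close(fig)  # release the figure before the next iteration

open_figures = plt.get_fignums()  # empty if every figure was closed
```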
❌ DON'T
- Mix plt. and ax. - It leads to "hidden state" bugs.
- Use plt.show() in loops - It blocks execution; use fig.savefig() instead.
- Manual legend placement - Let ax.legend(loc='best') try first.
- Hardcode font sizes - Use plt.rcParams.update({'font.size': 12}) for consistency.
- Use "Rainbow" (Jet) - It creates false gradients; use perceptually uniform maps like magma or inferno.
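If a global plt.rcParams.update() is too broad, plt.rc_context() applies the same settings only inside a with block. A minimal sketch, where the size 14 is an arbitrary choice for illustration:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run
import matplotlib.pyplot as plt

default_size = plt.rcParams["font.size"]

# Settings apply only inside the block and are restored afterwards.
with plt.rc_context({"font.size": 14}):
    size_inside = plt.rcParams["font.size"]
    fig, ax = plt.subplots()
    ax.set_xlabel("Time (s)")
    ax.set_ylabel("Amplitude")

size_after = plt.rcParams["font.size"]
plt.close(fig)
```

This keeps one figure's styling from leaking into the rest of a script or notebook.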
Anti-Patterns (NEVER)
# ❌ BAD: Mixing interfaces (State-based + OO)
plt.figure()
ax = plt.gca()
plt.plot(x, y) # Confusing state
ax.set_title('Test')
# ✅ GOOD: Consistent OO interface
fig, ax = plt.subplots()
ax.plot(x, y)
ax.set_title('Test')
# ❌ BAD: Overlapping subplots
fig, axs = plt.subplots(2, 2)
# Plots look squashed and titles overlap
# ✅ GOOD: Use constrained_layout or tight_layout
fig, axs = plt.subplots(2, 2, constrained_layout=True)
Anatomy of a Plot
Labels, Ticks, and Styles
fig, ax = plt.subplots()
ax.plot(x, y, 'o-', color='red', markersize=4, alpha=0.7)
# Explicitly setting limits
ax.set_xlim(0, 10)
ax.set_ylim(-1.5, 1.5)
# Controlling Ticks
ax.set_xticks([0, 2.5, 5, 7.5, 10])
ax.set_xticklabels(['Start', '1/4', 'Mid', '3/4', 'End'])
# Spines (Box around the plot)
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
# Adding text and arrows
ax.annotate('Local Max', xy=(1.5, 1), xytext=(3, 1.2),
            arrowprops=dict(facecolor='black', shrink=0.05))
Advanced Layouts
Subplots and GridSpec
# Simple 2x2 grid
fig, axs = plt.subplots(2, 2, figsize=(10, 10))
axs[0, 0].plot(x, y) # Top left
axs[1, 1].scatter(x, y) # Bottom right
# Complex grid (Uneven sizes)
fig = plt.figure(figsize=(10, 6))
gs = gridspec.GridSpec(2, 2, width_ratios=[2, 1], height_ratios=[1, 2])
ax1 = fig.add_subplot(gs[0, 0]) # Top left (large width)
ax2 = fig.add_subplot(gs[0, 1]) # Top right
ax3 = fig.add_subplot(gs[1, :]) # Bottom spanning all columns
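Recent Matplotlib releases also offer plt.subplot_mosaic, which describes the same kind of uneven layout with named panels. The sketch below is one possible equivalent of the GridSpec example; the panel names "main", "side", and "bottom" are invented for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run
import matplotlib.pyplot as plt

fig, axd = plt.subplot_mosaic(
    [["main", "side"],
     ["bottom", "bottom"]],  # repeating a name makes that panel span both columns
    gridspec_kw={"width_ratios": [2, 1], "height_ratios": [1, 2]},
    figsize=(10, 6),
    constrained_layout=True,
)

# Axes are looked up by name instead of by grid index.
for name, ax in axd.items():
    ax.set_title(name)

panel_names = sorted(axd)
n_axes = len(fig.axes)
plt.close(fig)
```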
Scientific Plot Types
Heatmaps and Colorbars
data = np.random.rand(10, 10)
fig, ax = plt.subplots()
im = ax.imshow(data, cmap='viridis', interpolation='nearest')
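# Add colorbar (simple option)
cbar = fig.colorbar(im, ax=ax, label='Intensity [a.u.]')
# Alternative: a colorbar that matches the axes height.
# Use this instead of the call above, otherwise two colorbars are drawn.
# from mpl_toolkits.axes_grid1 import make_axes_locatable
# divider = make_axes_locatable(ax)
# cax = divider.append_axes("right", size="5%", pad=0.05)
# fig.colorbar(im, cax=cax, label='Intensity [a.u.]')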
Histograms and Error Bars
# Histogram
data = np.random.normal(0, 1, 1000)
fig, ax = plt.subplots()
ax.hist(data, bins=30, density=True, alpha=0.6, color='g', edgecolor='black')
# Error bars (on a separate figure)
x = np.arange(10)
y = x**2
yerr = np.sqrt(y)
fig, ax = plt.subplots()
ax.errorbar(x, y, yerr=yerr, fmt='o', capsize=5, label='Data with noise')
ax.legend()
3D Plotting
from mpl_toolkits.mplot3d import Axes3D  # only needed on Matplotlib < 3.2; newer versions register '3d' automatically
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
X = np.arange(-5, 5, 0.25)
Y = np.arange(-5, 5, 0.25)
X, Y = np.meshgrid(X, Y)
R = np.sqrt(X**2 + Y**2)
Z = np.sin(R)
surf = ax.plot_surface(X, Y, Z, cmap='coolwarm', linewidth=0, antialiased=False)
fig.colorbar(surf, shrink=0.5, aspect=5)
Formatting for Publication
Using LaTeX and RcParams
# Global styling ('seaborn-v0_8-*' names need Matplotlib 3.6+)
plt.style.use('seaborn-v0_8-paper') # or 'ggplot', 'bmh'
# LaTeX for labels (text.usetex requires a working LaTeX installation)
plt.rcParams.update({
    "text.usetex": True,
    "font.family": "serif",
    "font.serif": ["Computer Modern Roman"],
    "axes.labelsize": 14,
})
fig, ax = plt.subplots()
ax.plot(x, y)
ax.set_xlabel(r'$\alpha_{i} + \beta \sin(\omega t)$') # LaTeX string
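When no LaTeX installation is available, Matplotlib's built-in mathtext renderer handles the same $...$ syntax for common symbols. A minimal sketch, assuming the "cm" mathtext font set is an acceptable stand-in for Computer Modern; the plotted curve is placeholder data.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: headless run
import matplotlib.pyplot as plt
import numpy as np

t = np.linspace(0, 2 * np.pi, 200)

# usetex=False keeps rendering inside Matplotlib (no external LaTeX needed).
with plt.rc_context({"text.usetex": False, "mathtext.fontset": "cm"}):
    fig, ax = plt.subplots()
    ax.plot(t, np.sin(2 * t))
    ax.set_xlabel(r"$\omega t$")
    ax.set_ylabel(r"$\alpha_{i} + \beta \sin(\omega t)$")
    buf = io.BytesIO()
    fig.savefig(buf, format="png", dpi=100)

plt.close(fig)
png_bytes = buf.getvalue()
```

Mathtext covers a subset of LaTeX, so labels that rely on extra packages still need text.usetex.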
Practical Workflows
1. Multi-dataset Comparison Workflow
def plot_comparison(datasets, labels):
    fig, ax = plt.subplots(figsize=(10, 6))
    colors = plt.cm.viridis(np.linspace(0, 1, len(datasets)))
    for data, label, color in zip(datasets, labels, colors):
        ax.plot(data['x'], data['y'], label=label, color=color, lw=1.5)
        ax.fill_between(data['x'], data['y']-data['std'], data['y']+data['std'],
                        alpha=0.2, color=color)
    ax.set_title('Experiment Results Comparison')
    ax.legend(frameon=False)
    return fig, ax
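One possible way to call this function with synthetic data is sketched below. The keys 'x', 'y', and 'std' follow what the function reads; the damped sine curves, the constant error band, and the run labels are invented for illustration. The function is repeated so the snippet runs on its own.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run
import matplotlib.pyplot as plt
import numpy as np

def plot_comparison(datasets, labels):
    # Same logic as the workflow above, repeated for self-containment.
    fig, ax = plt.subplots(figsize=(10, 6))
    colors = plt.cm.viridis(np.linspace(0, 1, len(datasets)))
    for data, label, color in zip(datasets, labels, colors):
        ax.plot(data["x"], data["y"], label=label, color=color, lw=1.5)
        ax.fill_between(data["x"], data["y"] - data["std"], data["y"] + data["std"],
                        alpha=0.2, color=color)
    ax.set_title("Experiment Results Comparison")
    ax.legend(frameon=False)
    return fig, ax

x = np.linspace(0, 10, 100)
datasets = [
    {"x": x, "y": np.exp(-k * x) * np.sin(x), "std": np.full_like(x, 0.05)}
    for k in (0.1, 0.2, 0.3)  # made-up damping rates
]
labels = ["Run A", "Run B", "Run C"]  # hypothetical experiment names

fig, ax = plot_comparison(datasets, labels)
n_lines = len(ax.lines)        # one line per dataset
n_bands = len(ax.collections)  # one shaded band per dataset
legend_labels = [t.get_text() for t in ax.get_legend().get_texts()]
plt.close(fig)
```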
2. Monitoring Real-time Data (Interactive)
# Needs an interactive backend (a GUI window, or e.g. %matplotlib widget in Jupyter)
plt.ion() # Interactive mode on
fig, ax = plt.subplots()
line, = ax.plot([], [])
for i in range(100):
    new_data = np.random.rand(10)
    line.set_data(np.arange(len(new_data)), new_data)
    ax.relim()
    ax.autoscale_view()
    fig.canvas.draw()
    fig.canvas.flush_events()
    plt.pause(0.1)
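For a continuous stream, a fixed-length buffer keeps the plotted window bounded. The sketch below uses collections.deque as one way to do that; the window size, the random "sensor" values, and the Agg backend are assumptions so it runs unattended. With a GUI backend, plt.pause() would go inside the loop as above.

```python
from collections import deque

import matplotlib
matplotlib.use("Agg")  # assumption: headless run; use a GUI backend for live viewing
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
window = 50                    # number of most recent samples to keep
buffer = deque(maxlen=window)  # old samples fall off automatically

fig, ax = plt.subplots()
line, = ax.plot([], [])

for step in range(60):
    buffer.append(rng.random())  # stand-in for a new sensor reading
    line.set_data(np.arange(len(buffer)), np.asarray(buffer))
    ax.relim()           # recompute data limits from the updated line
    ax.autoscale_view()  # apply them to the axes
    fig.canvas.draw()

n_points = len(line.get_ydata())
plt.close(fig)
```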
3. Creating an Annotated Correlation Matrix
import pandas as pd
df = pd.DataFrame(np.random.rand(10, 4), columns=['A', 'B', 'C', 'D'])
corr = df.corr()
fig, ax = plt.subplots()
im = ax.imshow(corr, cmap='RdBu_r', vmin=-1, vmax=1)
ax.set_xticks(np.arange(len(corr.columns)), labels=corr.columns)
ax.set_yticks(np.arange(len(corr.index)), labels=corr.index)
# Loop over data dimensions and create text annotations.
for i in range(len(corr.index)):
    for j in range(len(corr.columns)):
        text = ax.text(j, i, f"{corr.iloc[i, j]:.2f}",
                       ha="center", va="center", color="black")
Performance Optimization
Plotting Large Data
# 1. Use the 'Agg' backend for non-interactive rendering (select it before importing pyplot)
import matplotlib
matplotlib.use('Agg')
# 2. ax.scatter builds a PathCollection with per-point styling; slow for 1M points
ax.scatter(x, y, s=1)
#    For uniformly styled points, ax.plot with a pixel marker and no line is usually faster
ax.plot(x, y, ',')
# 3. For line plots, disable markers so only the line is drawn
ax.plot(x, y, marker='')
# 4. Decimate data before plotting
ax.plot(x[::10], y[::10]) # Plot every 10th point
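Plain striding such as x[::10] can drop narrow peaks. One alternative is min/max decimation, which keeps the envelope of each bin. The helper below is a hypothetical sketch, not a Matplotlib function, and the noisy sine is placeholder data.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run
import matplotlib.pyplot as plt
import numpy as np

def minmax_decimate(x, y, n_bins):
    """Return per-bin x centres plus min and max of y (hypothetical helper)."""
    usable = (len(y) // n_bins) * n_bins  # trailing samples that do not fill a bin are dropped
    x_bins = x[:usable].reshape(n_bins, -1)
    y_bins = y[:usable].reshape(n_bins, -1)
    return x_bins.mean(axis=1), y_bins.min(axis=1), y_bins.max(axis=1)

rng = np.random.default_rng(0)
x = np.linspace(0, 100, 1_000_000)
y = np.sin(x) + rng.normal(scale=0.1, size=x.size)

# 1M samples reduced to 2000 bins; the shaded band spans each bin's extremes.
x_c, y_lo, y_hi = minmax_decimate(x, y, n_bins=2000)

fig, ax = plt.subplots()
ax.fill_between(x_c, y_lo, y_hi, alpha=0.5)
ax.set_xlabel("x")
ax.set_ylabel("y")
plt.close(fig)
```

The band preserves the global extremes of the data, which simple striding does not guarantee.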
Common Pitfalls and Solutions
Date/Time Axis issues
# ❌ Problem: Dates look like a black blob
# ✅ Solution: Set an explicit date locator and formatter
import matplotlib.dates as mdates
fig, ax = plt.subplots()
ax.plot(dates, values) # dates: sequence of datetime objects; values: matching numbers
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m'))
fig.autofmt_xdate() # Rotates and right-aligns the tick labels
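When the date range is not known in advance, AutoDateLocator paired with ConciseDateFormatter picks tick spacing and label detail automatically. A self-contained sketch, with one year of made-up daily values:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np

# Placeholder data: one value per day for 2023.
dates = np.arange("2023-01-01", "2024-01-01", dtype="datetime64[D]")
rng = np.random.default_rng(0)
values = np.cumsum(rng.normal(size=dates.size))

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(dates, values)

# The locator chooses tick positions; the formatter shortens repeated date parts.
locator = mdates.AutoDateLocator()
ax.xaxis.set_major_locator(locator)
ax.xaxis.set_major_formatter(mdates.ConciseDateFormatter(locator))
ax.set_xlabel("Date")
ax.set_ylabel("Value (a.u.)")

formatter = ax.xaxis.get_major_formatter()
plt.close(fig)
```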
Multiple Legends on one plot
# ❌ Problem: Calling ax.legend() twice replaces the first one
# ✅ Solution: Add the first legend back manually as an artist
fig, ax = plt.subplots()
line1, = ax.plot([1, 2], [1, 2], label='Line 1')
line2, = ax.plot([1, 2], [2, 1], label='Line 2')
first_legend = ax.legend(handles=[line1], loc='upper left')
ax.add_artist(first_legend) # Add back
ax.legend(handles=[line2], loc='lower right')
Image Saving Quality (Clipping)
# ❌ Problem: Legend or Axis title is cut off in the .png file
# ✅ Solution:
fig.savefig('output.png', bbox_inches='tight')
Best Practices
- Always use the OO interface (fig, ax = plt.subplots()) for scripts and modules
- Save figures with appropriate formats - Use PDF/SVG for publications, PNG for web
- Set DPI appropriately - 300+ for print, 72-100 for screen
- Use bbox_inches='tight' when saving to prevent clipping
- Close figures in loops to prevent memory leaks
- Use colorblind-friendly colormaps - Avoid 'jet', prefer 'viridis', 'plasma', 'inferno'
- Label all axes with descriptive names and units
- Use constrained_layout=True for subplots to prevent overlap
- Configure global styles with plt.rcParams for consistency
- Test plots at target resolution before finalizing