plotly
Plotly - Interactive Visualization
Plotly provides a wide range of interactive charts. Its "Plotly Express" API is designed for speed and ease of use with tidy DataFrames, while "Graph Objects" offers low-level control over every trace and attribute.
When to Use
- Creating interactive charts for web applications or Jupyter notebooks
- Visualizing 3D data (surfaces, scatter, mesh)
- Geographic maps (scatter on maps, choropleths) with Mapbox integration
- Financial charts (candlestick, OHLC)
- Exploring large datasets where zooming into specific regions is required
- Creating animations (time-series sliders)
- Building production-ready dashboards (via Dash)
Reference Documentation
Official docs: https://plotly.com/python/
Plotly Express: https://plotly.com/python/plotly-express/
Search patterns: px.scatter, go.Figure, fig.update_layout, fig.write_html, px.choropleth
Core Principles
Plotly Express (px) vs. Graph Objects (go)
| Feature | Plotly Express (px) | Graph Objects (go) |
|---|---|---|
| Complexity | High-level, concise. | Low-level, verbose. |
| Data Format | Tidy (long-form) DataFrames. | Lists, Arrays, Dicts, or DataFrames. |
| Customization | Good (using update_*). | Maximum / Full control. |
| Speed of Dev | Very fast. | Slower. |
Use Plotly For
- Interactive exploration (hover, zoom)
- 3D and Geospatial visualization
- Exporting to standalone interactive HTML files
- Integration with Dash
Do NOT Use For
- Publication-quality static LaTeX plots (use Matplotlib)
- Very large static image generation (Matplotlib is faster)
- Low-memory environments (Plotly's JSON-based figures are memory-heavy)
Quick Reference
Installation
pip install plotly pandas
Standard Imports
import plotly.express as px
import plotly.graph_objects as go
import pandas as pd
import numpy as np
Basic Pattern - Plotly Express
import plotly.express as px
# Load data
df = px.data.iris()
# Create interactive scatter plot
fig = px.scatter(df, x="sepal_width", y="sepal_length",
color="species", size="petal_length",
hover_data=['petal_width'])
# Display
fig.show()
Critical Rules
✅ DO
- Use Plotly Express first - 90% of tasks are easier with px
- Prefer Tidy Data - Ensure one row per observation for easy mapping to colors/axes
- Use update_layout - Cleanly modify titles, fonts, and background colors
- Save as HTML - Use fig.write_html("plot.html") to share interactive charts
- Leverage Hover Data - Add context to points without cluttering the plot
- Set Figure Templates - Use template="plotly_dark" or "ggplot2" for instant style
- Use marginal_x/y - In px.scatter, quickly add histograms or boxplots to margins
❌ DON'T
- Pass huge datasets to the browser - Plotting >50k points can lag the UI; use datashader or decimation
- Manual looping with go - If px can do it, don't use a for-loop to add traces in go
- Forget to set axis labels - px uses raw column names by default; pass labels={"col": "Readable name"} or set xaxis_title/yaxis_title in update_layout
- Over-animate - Smooth animations are cool, but too many moving parts distract from the data
Anti-Patterns (NEVER)
# ❌ BAD: Over-complicating a simple plot with Graph Objects
fig = go.Figure()
for species in df['species'].unique():
    sub = df[df['species'] == species]
    fig.add_trace(go.Scatter(x=sub['sepal_width'], y=sub['sepal_length'],
                             mode='markers', name=species))
# ✅ GOOD: Use Plotly Express (One line, automatic legend/colors)
fig = px.scatter(df, x="sepal_width", y="sepal_length", color="species")
# ❌ BAD: Mixing list-style data with DataFrame-style data in px
px.scatter(x=[1,2,3], y=df['column']) # Can lead to alignment issues
# ✅ GOOD: Stick to the DataFrame
px.scatter(df, x="column_a", y="column_b")
Plotly Express (px) Deep Dive
Statistical Charts
df = px.data.tips()
# Boxplot with points
fig = px.box(df, x="day", y="total_bill", color="smoker", points="all")
# Violin plot with box inside
fig = px.violin(df, x="day", y="total_bill", color="sex", box=True, points="all")
# Density heatmap (2D histogram) with marginal histograms
fig = px.density_heatmap(df, x="total_bill", y="tip", marginal_x="histogram", marginal_y="histogram")
Time Series and Faceting
df = px.data.stocks()
# Multiple lines from wide data
fig = px.line(df, x='date', y=["GOOG", "AAPL", "AMZN"], title="Tech Stocks")
# Faceting (Subplots by category)
df = px.data.tips()
fig = px.scatter(df, x="total_bill", y="tip", color="smoker",
facet_col="day", facet_row="time")
3D Visualization
Scatter, Lines, and Surfaces
# 3D Scatter (iris columns, so reload the iris dataset)
df = px.data.iris()
fig = px.scatter_3d(df, x='sepal_length', y='sepal_width', z='petal_width', color='species')
# 3D Surface (Using Graph Objects)
z_data = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/api_docs/mt_bruno_elevation.csv')
fig = go.Figure(data=[go.Surface(z=z_data.values)])
fig.update_layout(title='Mt Bruno Elevation', autosize=False,
width=500, height=500, margin=dict(l=65, r=50, b=65, t=90))
Geospatial Analysis
Maps and Choropleths
# Scatter on a map
df = px.data.gapminder().query("year == 2007")
fig = px.scatter_geo(df, locations="iso_alpha", color="continent",
hover_name="country", size="pop",
projection="natural earth")
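# County-level choropleth on a tile map. The "carto-positron" and "open-street-map"
# styles need no Mapbox token; Mapbox-hosted styles do.
import json
from urllib.request import urlopen
with urlopen('https://raw.githubusercontent.com/plotly/datasets/master/geojson-counties-fips.json') as response:
    counties = json.load(response)  # GeoJSON features keyed by FIPS code
df = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/fips-unemp-16.csv',
                 dtype={'fips': str})  # keep leading zeros in FIPS codes
fig = px.choropleth_mapbox(df, geojson=counties, locations='fips', color='unemp',
                           color_continuous_scale="Viridis",
                           mapbox_style="carto-positron",
                           zoom=3, center={"lat": 37.0902, "lon": -95.7129})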
Layout and Styling (fig.update_*)
Fine-tuning the appearance
fig = px.scatter(df, x="x", y="y")
# Global layout updates
fig.update_layout(
title="Custom Styled Plot",
xaxis_title="Dimension X",
yaxis_title="Dimension Y",
font=dict(family="Courier New, monospace", size=18, color="RebeccaPurple"),
legend=dict(yanchor="top", y=0.99, xanchor="left", x=0.01),
plot_bgcolor="white"
)
# Axis specific updates
fig.update_xaxes(showgrid=True, gridwidth=1, gridcolor='LightPink')
fig.update_yaxes(zeroline=True, zerolinewidth=2, zerolinecolor='Black')
Advanced Interaction: Animations
df = px.data.gapminder()
fig = px.scatter(df, x="gdpPercap", y="lifeExp", animation_frame="year",
animation_group="country",
size="pop", color="continent", hover_name="country",
log_x=True, size_max=55, range_x=[100, 100000], range_y=[25, 90])
Practical Workflows
1. Interactive Scientific Report Export
def create_interactive_report(df, filename="report.html"):
    """Generates a multi-chart HTML report."""
    fig1 = px.scatter(df, x="A", y="B", color="C")
    fig2 = px.histogram(df, x="A", color="C")
    # Open with 'w' so re-running overwrites the report instead of appending duplicates
    with open(filename, 'w', encoding='utf-8') as f:
        # Load plotly.js from the CDN once; later fragments reuse it
        f.write(fig1.to_html(full_html=False, include_plotlyjs='cdn'))
        f.write(fig2.to_html(full_html=False, include_plotlyjs=False))
# Useful for sharing findings with non-technical stakeholders
2. Financial Dashboard Fragment (Candlestick)
import pandas as pd
import plotly.graph_objects as go
df = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/finance-charts-apple.csv')
fig = go.Figure(data=[go.Candlestick(x=df['Date'],
open=df['AAPL.Open'],
high=df['AAPL.High'],
low=df['AAPL.Low'],
close=df['AAPL.Close'])])
# Remove rangeslider for cleaner look
fig.update_layout(xaxis_rangeslider_visible=False)
3. Mixing Subplots with go.Figure
from plotly.subplots import make_subplots
fig = make_subplots(rows=1, cols=2, subplot_titles=("Plot A", "Plot B"))
fig.add_trace(go.Scatter(x=[1, 2, 3], y=[4, 5, 6]), row=1, col=1)
fig.add_trace(go.Bar(x=[1, 2, 3], y=[2, 3, 5]), row=1, col=2)
fig.update_layout(height=600, width=800, title_text="Side-by-Side Comparison")
Performance Optimization
WebGL for Large Datasets
# For scatter plots with >10,000 points, use go.Scattergl (Graph Objects)
# or pass render_mode="webgl" to px.scatter / px.line.
# (With the default render_mode="auto", px already switches to WebGL for larger inputs.)
fig = px.scatter(df, x="large_x", y="large_y", render_mode="webgl")
# WebGL renders on the GPU, which usually keeps pan and zoom responsive at this scale.
Common Pitfalls and Solutions
JSON Overhead in Notebooks
# ❌ Problem: Notebook file size explodes to 50MB
# ✅ Solution: Display as static image (requires kaleido) or use a different renderer
# fig.show(renderer="png") # Static
# OR: Clear output after viewing
Axis Scaling in Animations
# ❌ Problem: Axes jump around during animation
# ✅ Solution: Manually fix the ranges
fig = px.scatter(df, x="x", y="y", animation_frame="time",
range_x=[0, 100], range_y=[0, 100])
Handling Missing Categories in Legend
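# ❌ Problem: Colors change when filtering data because categories disappear
# ✅ Solution: Pin each category to a color with color_discrete_map;
#    category_orders keeps the legend order stable
fig = px.scatter(df, x="x", y="y", color="category",
                 category_orders={"category": ["A", "B", "C", "D"]},
                 color_discrete_map={"A": "#636EFA", "B": "#EF553B",
                                     "C": "#00CC96", "D": "#AB63FA"})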
Best Practices
- Use Plotly Express first - Start with px for 90% of tasks; only use go when you need fine-grained control
- Work with tidy DataFrames - Ensure one row per observation for easy mapping to visual attributes
- Use update_layout for styling - Cleanly modify titles, fonts, and background colors without recreating figures
- Save as HTML for sharing - Use fig.write_html("plot.html") to share interactive charts with stakeholders
- Leverage hover data - Add context to points without cluttering the plot
- Set figure templates - Use template="plotly_dark" or "ggplot2" for instant professional styling
- Use marginal plots - In px.scatter, use marginal_x and marginal_y to quickly add histograms or boxplots
- Optimize for large datasets - Use WebGL rendering or datashader for datasets with >50k points
- Fix axis ranges in animations - Use range_x and range_y to prevent axes from jumping during animations
- Set category orders and colors - Use category_orders to keep the legend order stable and color_discrete_map to keep colors consistent when filtering data
Plotly bridges the gap between static analysis and interactive discovery. It is a strong choice for moving scientific insights from a notebook to the web.