scikit-image - Scientific Image Processing
scikit-image treats images as NumPy arrays. It provides a comprehensive suite of algorithms for filtering, feature detection, and object measurement, making it the standard for research-grade image analysis.
When to Use
- Preprocessing scientific images (noise reduction, contrast enhancement).
- Image segmentation (separating cells, particles, or regions of interest).
- Feature extraction (detecting edges, corners, blobs, or textures).
- Geometric transformations (rescaling, rotating, warping).
- Morphological operations (thinning, skeletonization, hole filling).
- Measuring object properties (area, perimeter, eccentricity).
- Restoring degraded images (deconvolution, inpainting).
Reference Documentation
Official docs: https://scikit-image.org/
User Guide: https://scikit-image.org/docs/stable/user_guide.html
Search patterns: skimage.filters, skimage.segmentation, skimage.feature, skimage.morphology
Core Principles
Images are NumPy Arrays
A grayscale image is a 2D array of shape (M, N). A color image is a 3D array of shape (M, N, 3). A multichannel 3D volume is a 4D array of shape (P, M, N, C), where P is the number of planes and C the number of channels.
Coordinate System
The origin (0, 0) is at the top-left corner. Coordinates are always represented as (row, column).
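A minimal NumPy-only sketch of the (row, column) convention (array values here are arbitrary):

```python
import numpy as np

# A tiny 3x4 "image": 3 rows, 4 columns
image = np.zeros((3, 4), dtype=np.uint8)

# Set the pixel in row 1, column 2 (NOT x=1, y=2)
image[1, 2] = 255

print(image.shape)   # (3, 4) -> (rows, columns)
print(image[1, 2])   # 255
```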
Data Types and Ranges
scikit-image handles various dtypes with specific ranges:
- uint8: 0 to 255
- uint16: 0 to 65535
- float: -1 to 1 or 0 to 1
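A small sketch of how the utility converters rescale between these ranges:

```python
import numpy as np
from skimage import util

img_u8 = np.array([[0, 128, 255]], dtype=np.uint8)

# img_as_float rescales uint8 0..255 to float 0.0..1.0
img_f = util.img_as_float(img_u8)
print(img_f)  # [[0.         0.50196078 1.        ]]

# img_as_ubyte performs the reverse conversion
img_back = util.img_as_ubyte(img_f)
print(img_back)  # [[  0 128 255]]
```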
Quick Reference
Installation
pip install scikit-image
Standard Imports
import numpy as np
import matplotlib.pyplot as plt
from skimage import io, filters, segmentation, feature, measure, morphology, color, util
Basic Pattern - Load and Filter
from skimage import io, filters, color
# Load image
image = io.imread('data.png')
# Convert to grayscale if needed
gray_image = color.rgb2gray(image)
# Apply a filter (e.g., Gaussian blur)
blurred = filters.gaussian(gray_image, sigma=2.0)
# Display with matplotlib (io.imshow is deprecated in recent releases)
import matplotlib.pyplot as plt
plt.imshow(blurred, cmap='gray')
plt.show()
Critical Rules
✅ DO
- Use utility functions for conversion - Use util.img_as_float and util.img_as_ubyte to handle rescaling automatically when changing dtypes.
- Grayscale for analysis - Most feature extraction and segmentation algorithms expect 2D grayscale arrays.
- Check image shape - Always verify image.shape and image.dtype before processing.
- Vectorize - Use NumPy operations instead of loops over pixels.
- Apply filters before segmentation - Denoising (Gaussian, median) significantly improves segmentation results.
- Use label for object counting - measure.label is the standard way to identify connected components.
❌ DON'T
- Manually rescale dtypes - Avoid image / 255.0; use util.img_as_float.
- Ignore coordinate conventions - Be careful with (x, y) vs (row, col); scikit-image uses (row, col).
- Modify the input image - Most functions return a new array; work with copies if you need to mutate.
- Apply color-sensitive filters to grayscale - Some filters behave differently on 3D vs 2D arrays.
Anti-Patterns (NEVER)
# ❌ BAD: Manual dtype conversion (doesn't handle scaling)
image_float = image.astype(float) # Values stay 0-255 but as floats
# ✅ GOOD: Safe utility conversion
from skimage import util
image_float = util.img_as_float(image) # Values scaled to 0.0-1.0
# ❌ BAD: Looping over pixels
for r in range(rows):
    for c in range(cols):
        if image[r, c] > 128:
            image[r, c] = 255
# ✅ GOOD: Vectorized thresholding
image[image > 128] = 255
# ❌ BAD: Plotting multiple images without a shared axis
plt.imshow(img1); plt.show()
plt.imshow(img2); plt.show()
# ✅ GOOD: Using subplots for comparison
fig, ax = plt.subplots(1, 2)
ax[0].imshow(img1, cmap='gray')
ax[1].imshow(img2, cmap='gray')
Filtering and Restoration (skimage.filters)
Denoising and Edge Detection
from skimage import feature, filters, morphology
# Edge detection
edges_sobel = filters.sobel(image)
edges_canny = feature.canny(image, sigma=3)
# Denoising
denoised = filters.median(image, morphology.disk(3))
# Thresholding (Automatic)
val = filters.threshold_otsu(image)
binary = image > val
Morphology (skimage.morphology)
Shaping and Structural Analysis
from skimage import morphology
# Binary morphology
struct_element = morphology.disk(5)
eroded = morphology.erosion(binary, struct_element)
dilated = morphology.dilation(binary, struct_element)
opened = morphology.opening(binary, struct_element) # Erosion then Dilation
# Skeletonization (Reducing objects to 1-pixel width)
skeleton = morphology.skeletonize(binary)
# Removing small objects
clean_binary = morphology.remove_small_objects(binary, min_size=64)
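A quick sketch of remove_small_objects on a toy binary mask (object sizes chosen arbitrarily):

```python
import numpy as np
from skimage import morphology

mask = np.zeros((10, 10), dtype=bool)
mask[1:4, 1:4] = True   # 9-pixel object
mask[7, 7] = True       # 1-pixel speck

# Objects smaller than min_size pixels are removed
cleaned = morphology.remove_small_objects(mask, min_size=5)
print(cleaned.sum())  # 9: the speck is gone, the block survives
```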
Segmentation (skimage.segmentation)
Separating Objects
import numpy as np
from scipy import ndimage as ndi
from skimage import segmentation, color, feature
# Watershed Segmentation (Marker-based); `binary` is a boolean foreground mask
distance = ndi.distance_transform_edt(binary)
coords = feature.peak_local_max(distance, footprint=np.ones((3, 3)), labels=binary)
mask = np.zeros(distance.shape, dtype=bool)
mask[tuple(coords.T)] = True
markers, _ = ndi.label(mask)
labels = segmentation.watershed(-distance, markers, mask=binary)
# SLIC (Superpixels)
segments = segmentation.slic(image, n_segments=100, compactness=10)
out = color.label2rgb(segments, image, kind='avg')
Feature Detection (skimage.feature)
Keypoints and Textures
from skimage import feature
# Local Binary Patterns (Texture analysis)
lbp = feature.local_binary_pattern(image, P=8, R=1)
# Corner detection (Harris)
coords = feature.corner_peaks(feature.corner_harris(image), min_distance=5)
# Blob detection (Difference of Gaussian)
blobs = feature.blob_dog(image, max_sigma=30, threshold=.1)
# blobs: array of [row, col, sigma]
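The third column is the detection scale (sigma), not a radius; per the scikit-image docs, a 2D blob's radius for blob_dog is approximately sigma * sqrt(2). A sketch of the conversion, using a hypothetical output array in the shape shown above:

```python
import numpy as np

# Hypothetical blob_dog output rows: [row, col, sigma]
blobs = np.array([[10.0, 12.0, 3.0],
                  [40.0, 25.0, 5.0]])

# Convert sigma to an approximate blob radius (2D case)
radii = blobs[:, 2] * np.sqrt(2)
print(radii)  # approximately [4.24, 7.07]
```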
Measurements (skimage.measure)
Quantifying Results
from skimage import measure
# Label connected components
labels = measure.label(binary)
# Calculate properties
props = measure.regionprops(labels)
for prop in props:
    print(f"Label: {prop.label}")
    print(f"Area: {prop.area}")
    print(f"Centroid: {prop.centroid}")
    print(f"Eccentricity: {prop.eccentricity}")
# Find contours
contours = measure.find_contours(binary, 0.8)
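For tabular output (e.g., to feed into pandas), measure.regionprops_table returns a dict of column arrays; a minimal sketch on a toy mask:

```python
import numpy as np
from skimage import measure

binary = np.zeros((8, 8), dtype=bool)
binary[2:4, 2:4] = True  # one 4-pixel square

labels = measure.label(binary)
table = measure.regionprops_table(labels, properties=('label', 'area', 'centroid'))
print(table['area'])        # one object with area 4
print(table['centroid-0'])  # row centroid 2.5
```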
Practical Workflows
1. Particle Counting Pipeline
def count_particles(image_path):
    # 1. Load and grayscale
    img = color.rgb2gray(io.imread(image_path))
    # 2. Denoise
    img_denoised = filters.gaussian(img, sigma=1)
    # 3. Threshold
    thresh = filters.threshold_otsu(img_denoised)
    binary = img_denoised < thresh  # Assuming dark particles
    # 4. Morphological cleaning
    binary = morphology.remove_small_objects(binary, 50)
    binary = morphology.closing(binary, morphology.disk(3))
    # 5. Label and measure
    labels = measure.label(binary)
    return measure.regionprops(labels), labels
# properties, label_img = count_particles('samples.tif')
2. Micrograph Analysis (Nuclei Segmentation)
def segment_nuclei(dna_image):
    """Identify nuclei in a DAPI/Hoechst stained image."""
    # Local thresholding for uneven illumination
    local_thresh = filters.threshold_local(dna_image, block_size=35)
    binary = dna_image > local_thresh
    # Fill holes
    filled = ndi.binary_fill_holes(binary)
    # Separate touching nuclei (peak_local_max returns coordinates;
    # the old indices=False keyword was removed in recent releases)
    distance = ndi.distance_transform_edt(filled)
    coords = feature.peak_local_max(distance, footprint=np.ones((15, 15)), labels=filled)
    peak_mask = np.zeros(distance.shape, dtype=bool)
    peak_mask[tuple(coords.T)] = True
    markers = measure.label(peak_mask)
    labels = segmentation.watershed(-distance, markers, mask=filled)
    return labels
3. Change Detection (Image Subtraction)
def detect_change(img_before, img_after, threshold=0.1):
    # Ensure float
    im1 = util.img_as_float(color.rgb2gray(img_before))
    im2 = util.img_as_float(color.rgb2gray(img_after))
    # Difference
    diff = np.abs(im1 - im2)
    # Filter noise in difference
    diff_clean = filters.median(diff, morphology.disk(2))
    return diff_clean > threshold
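As an alternative to raw differencing, structural similarity (skimage.metrics.structural_similarity) compares local structure rather than per-pixel intensity; a minimal sketch on synthetic data:

```python
import numpy as np
from skimage.metrics import structural_similarity

rng = np.random.default_rng(0)
img_a = rng.random((32, 32))
img_b = img_a.copy()
img_b[10:15, 10:15] += 0.5  # introduce a localized change

# full=True also returns a per-pixel similarity map
score, sim_map = structural_similarity(img_a, img_b, data_range=1.0, full=True)
print(score < 1.0)  # True: the change lowers similarity
# Low values in sim_map flag the changed region
```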
Performance Optimization
Using skimage.util.view_as_windows
from skimage.util import view_as_windows
# Create overlapping patches without copying data (using strides)
# Fast for local neighborhood calculations
patches = view_as_windows(image, (64, 64), step=32)
# Shape: (n_patches_r, n_patches_c, 64, 64)
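Per-patch statistics then reduce over the last two axes; a small sketch with a toy array:

```python
import numpy as np
from skimage.util import view_as_windows

image = np.arange(16, dtype=float).reshape(4, 4)

# Non-overlapping 2x2 patches (step equals window size)
patches = view_as_windows(image, (2, 2), step=2)
print(patches.shape)  # (2, 2, 2, 2)

# Mean of each 2x2 patch
means = patches.mean(axis=(2, 3))
print(means)  # [[ 2.5  4.5] [10.5 12.5]]
```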
Parallel Processing
# Many skimage functions are already Cython-optimized.
# For batch processing, use standard joblib or multiprocessing.
from joblib import Parallel, delayed
results = Parallel(n_jobs=-1)(delayed(filters.gaussian)(img) for img in image_list)
Common Pitfalls and Solutions
Handling Multichannel Images
# ❌ Problem: Filters failing on RGB images
# result = filters.median(rgb_image) # Raises error
# ✅ Solution: Filter per channel or convert
# filters.median(color.rgb2gray(rgb_image))
# OR apply to each channel
# result = np.stack([filters.median(rgb_image[..., i]) for i in range(3)], axis=-1)
Memory issues with label
# ❌ Problem: Large noisy images produce millions of 1-pixel labels
# ✅ Solution: Clear small noise before labeling
binary = morphology.remove_small_objects(binary, min_size=10)
labels = measure.label(binary)
Coordinates: (X, Y) vs (Row, Col) confusion
# ❌ Problem: Points appear swapped on plot
# Matplotlib: plt.plot(x, y)
# scikit-image: image[row, col]
# ✅ Solution: Explicitly swap for visualization
coords = feature.peak_local_max(img) # Returns [row, col]
plt.plot(coords[:, 1], coords[:, 0], 'r.') # Plot as [x, y]
scikit-image provides well-tested, documented algorithms for quantitative image analysis. By building on the NumPy ecosystem, it enables a seamless workflow from raw sensor data to quantifiable results.