Drone CV Expert

Expert in robotics, drone systems, and computer vision for autonomous aerial platforms.

Decision Tree: When to Use This Skill

User mentions drones or UAVs?
├─ YES → Is it about inspection/detection of specific things (fire, roof damage, thermal)?
│        ├─ YES → Use drone-inspection-specialist
│        └─ NO → Is it about flight control, navigation, or general CV?
│                ├─ YES → Use THIS SKILL (drone-cv-expert)
│                └─ NO → Is it about GPU rendering/shaders?
│                        ├─ YES → Use metal-shader-expert
│                        └─ NO → Use THIS SKILL as default drone skill
└─ NO → Is it general object detection without drone context?
        ├─ YES → Use clip-aware-embeddings or other CV skill
        └─ NO → Probably not a drone question

Core Competencies

Flight Control & Navigation

  • PID Tuning: Position, velocity, attitude control loops
  • SLAM: ORB-SLAM, LSD-SLAM, visual-inertial odometry (VIO)
  • Path Planning: A*, RRT, RRT*, Dijkstra, potential fields
  • Sensor Fusion: EKF, UKF, complementary filters
  • GPS-Denied Navigation: AprilTags, visual odometry, LiDAR SLAM
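
As a concrete starting point for the PID bullet above, here is a minimal single-axis sketch in the cascaded position → velocity style most autopilots use; the gains, limits, and loop rate are illustrative assumptions, not tuned values:

```python
# Minimal single-axis PID with anti-windup (illustrative; gains are not tuned).
class PID:
    def __init__(self, kp, ki, kd, output_limit):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.output_limit = output_limit
        self.integral = 0.0
        self.prev_error = None

    def update(self, setpoint, measurement, dt):
        error = setpoint - measurement
        # Anti-windup: clamp the integral so it cannot saturate the output alone.
        self.integral += error * dt
        self.integral = max(-self.output_limit, min(self.output_limit, self.integral))
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        out = self.kp * error + self.ki * self.integral + self.kd * derivative
        return max(-self.output_limit, min(self.output_limit, out))

# Cascade: position error -> velocity setpoint (which an inner loop would track).
pos_pid = PID(kp=1.2, ki=0.05, kd=0.3, output_limit=2.0)          # 2 m/s velocity cap
vel_cmd = pos_pid.update(setpoint=5.0, measurement=3.8, dt=0.02)  # 50 Hz loop
```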

Computer Vision

  • Object Detection: YOLO (v5/v8/v10), EfficientDet, SSD
  • Tracking: ByteTrack, DeepSORT, SORT, optical flow
  • Edge Deployment: TensorRT, ONNX, OpenVINO optimization
  • 3D Vision: Stereo depth, point clouds, structure-from-motion
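
For the detection bullet, a minimal inference sketch using the ultralytics package (model variant and confidence threshold are assumptions to tune per mission):

```python
# Run YOLOv8 nano on one camera frame and print detections (sketch, not production).
import cv2
from ultralytics import YOLO

model = YOLO("yolov8n.pt")            # nano variant suits edge-class hardware
cap = cv2.VideoCapture(0)             # first attached camera

ok, frame = cap.read()
if ok:
    results = model(frame, conf=0.4, verbose=False)
    for box in results[0].boxes:
        x1, y1, x2, y2 = map(int, box.xyxy[0])        # pixel corners
        label = model.names[int(box.cls[0])]
        print(label, round(float(box.conf[0]), 2), (x1, y1, x2, y2))
cap.release()
```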

Hardware Integration

  • Flight Controllers: Pixhawk, ArduPilot, PX4, DJI
  • Protocols: MAVLink, DroneKit, MAVSDK
  • Edge Compute: Jetson (Nano/Xavier/Orin), Coral TPU
  • Sensors: IMU, GPS, barometer, LiDAR, depth cameras
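
A minimal telemetry read over MAVLink with pymavlink (the connection string is an assumption; SITL shown, serial links use a device path and baud rate instead):

```python
# Connect to an autopilot and read one ATTITUDE message (sketch).
from pymavlink import mavutil

link = mavutil.mavlink_connection("udp:127.0.0.1:14550")   # SITL default
link.wait_heartbeat()                                      # blocks until autopilot seen
print("Heartbeat from system", link.target_system)

msg = link.recv_match(type="ATTITUDE", blocking=True, timeout=5)
if msg:
    print(f"roll={msg.roll:.3f} pitch={msg.pitch:.3f} yaw={msg.yaw:.3f}  (radians)")
```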

Anti-Patterns to Avoid

1. "Simulation-Only Syndrome"

Wrong: Testing only in Gazebo/AirSim, then deploying directly to a real drone.
Right: Simulation → Bench test → Tethered flight → Controlled environment → Field.

2. "EKF Overkill"

Wrong: Using an Extended Kalman Filter when a complementary filter suffices.
Right: Match filter complexity to requirements (see the sketch after this list):

  • Complementary filter: Basic stabilization, attitude only
  • EKF: Multi-sensor fusion, GPS+IMU+baro
  • UKF: Highly nonlinear systems, aggressive maneuvers
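
A minimal complementary-filter sketch for pitch (alpha, dt, and the small-roll accelerometer formula are simplifying assumptions; alpha near 0.98 is a common starting point):

```python
# Blend drift-free-but-noisy accel pitch with smooth-but-drifting gyro integration.
import math

def complementary_pitch(pitch_prev, gyro_rate, ax, az, dt, alpha=0.98):
    accel_pitch = math.atan2(-ax, az)         # gravity reference; assumes small roll
    gyro_pitch = pitch_prev + gyro_rate * dt  # integrate rate gyro (rad/s)
    return alpha * gyro_pitch + (1 - alpha) * accel_pitch
```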

3. "Max Resolution Assumption"

Wrong: Processing 4K frames at 30 fps and expecting real-time performance.
Right: Trade resolution against altitude and speed (selector sketch after the table):

Altitude  Speed   Resolution  FPS  Rationale
<30m      Slow    1920x1080   30   Detail needed
30-100m   Medium  1280x720    30   Balance
>100m     Fast    640x480     60   Speed priority
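
A selector that mirrors the ladder above (thresholds copied from the table; treat them as starting points, and step down a tier when flying fast):

```python
# Map altitude to a capture resolution/FPS tier and downscale frames to match.
import cv2

def pick_resolution(altitude_m):
    if altitude_m < 30:
        return (1920, 1080), 30
    if altitude_m <= 100:
        return (1280, 720), 30
    return (640, 480), 60

size, fps = pick_resolution(altitude_m=75.0)
frame = cv2.imread("frame.jpg")                    # stand-in for a camera frame
if frame is not None:
    small = cv2.resize(frame, size, interpolation=cv2.INTER_AREA)
```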

4. "Single-Thread Processing"

Wrong: Sequential detect → track → control in one loop.
Right: Pipeline parallelism (queue-based sketch below):

Thread 1: Camera capture (async)
Thread 2: Object detection (GPU)
Thread 3: Tracking + state estimation
Thread 4: Control commands
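
A queue-based sketch of that pipeline (stage bodies are placeholders; cam and model stand for your capture device and detector):

```python
# Bounded queues decouple stages; full queues drop frames instead of adding latency.
import queue
import threading

frames = queue.Queue(maxsize=2)
detections = queue.Queue(maxsize=2)

def capture_stage(cam):
    while True:
        ok, frame = cam.read()
        if not ok:
            break
        try:
            frames.put_nowait(frame)     # prefer dropping stale frames
        except queue.Full:
            pass

def detect_stage(model):
    while True:
        frame = frames.get()
        detections.put(model(frame))     # GPU-bound stage runs independently

# threading.Thread(target=capture_stage, args=(cam,), daemon=True).start()
# threading.Thread(target=detect_stage, args=(model,), daemon=True).start()
```

The maxsize=2 bound is the design point: when detection falls behind, capture drops frames rather than letting end-to-end latency grow without bound.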

5. "GPS Trust"

Wrong: Assuming GPS is always accurate and available.
Right: Multi-source position estimation (fallback sketch after this list):

  • GPS: 2-5 m accuracy outdoors, unavailable indoors
  • Visual odometry: 0.1-1% drift, lighting dependent
  • AprilTags: cm-level accuracy where deployed
  • IMU: short-term only; drift accumulates
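
A fallback selector in that spirit (the fix objects and num_sats field are hypothetical; adapt to your estimator's actual interface):

```python
# Prefer the most accurate position source currently available.
def best_position(apriltag_fix, gps_fix, vio_estimate):
    if apriltag_fix is not None:
        return apriltag_fix, "apriltag"        # cm-level where tags are deployed
    if gps_fix is not None and gps_fix.num_sats >= 6:
        return gps_fix.position, "gps"         # 2-5 m outdoors
    return vio_estimate, "vio"                 # drifts ~0.1-1% of distance flown
```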

6. "One Model Fits All"

Wrong: Using the same YOLO model for all scenarios.
Right: Model selection by constraint (TensorRT export sketch after the table):

Constraint        Model               Notes
Latency critical  YOLOv8n             6 ms inference
Balanced          YOLOv8s             15 ms, better accuracy
Accuracy first    YOLOv8x             50 ms, highest mAP
Edge device       YOLOv8n + TensorRT  3 ms on Jetson
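
For the edge row, a sketch of exporting to TensorRT with the ultralytics exporter (half precision is an assumption; re-validate accuracy after export):

```python
# Build a TensorRT engine from YOLOv8n, then load it like any other model.
from ultralytics import YOLO

model = YOLO("yolov8n.pt")
engine_path = model.export(format="engine", half=True)   # runs TensorRT on-device
trt_model = YOLO(engine_path)
results = trt_model("frame.jpg")                         # same inference API
```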

Problem-Solving Framework

1. Constraint Analysis

  • Compute: What hardware? (Jetson Nano ≈ 0.5 TFLOPS FP16, AGX Xavier ≈ 32 TOPS INT8)
  • Power: Battery capacity? Flight time impact?
  • Latency: Control loop rate? Detection response time?
  • Weight: Payload capacity? Center of gravity?
  • Environment: Indoor/outdoor? GPS available? Lighting conditions?

2. Algorithm Selection Matrix

Problem           Classical Approach  Deep Learning  When to Use Each
Feature tracking  KLT optical flow    FlowNet        Classical: real-time, limited compute. DL: more robust, needs more compute.
Object detection  HOG+SVM             YOLO/SSD       Classical: simple objects, no GPU. DL: complex scenes, GPU available.
SLAM              ORB-SLAM            DROID-SLAM     Classical: mature, debuggable. DL: better in challenging scenes.
Path planning     A*, RRT             RL-based       Classical: known environments. DL: complex, dynamic environments.
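
As an example of the classical column, KLT feature tracking with OpenCV (corner and flow parameters are typical defaults, not tuned values):

```python
# Track Shi-Tomasi corners between two frames with pyramidal Lucas-Kanade.
import cv2

prev_gray = cv2.cvtColor(cv2.imread("frame0.jpg"), cv2.COLOR_BGR2GRAY)
next_gray = cv2.cvtColor(cv2.imread("frame1.jpg"), cv2.COLOR_BGR2GRAY)

pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200,
                              qualityLevel=0.01, minDistance=8)
new_pts, status, err = cv2.calcOpticalFlowPyrLK(prev_gray, next_gray, pts, None)
tracked = new_pts[status.flatten() == 1]       # keep successfully tracked points
print(f"tracked {len(tracked)} of {len(pts)} features")
```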

3. Safety Checklist

  • Kill switch tested and accessible
  • Geofence configured
  • Return-to-home altitude set
  • Low battery action defined
  • Signal loss action defined
  • Propeller guards (if applicable)
  • Pre-flight sensor calibration
  • Weather conditions checked

Quick Reference Tables

MAVLink Message Types

Message              Purpose           Frequency
HEARTBEAT            Connection alive  1 Hz
ATTITUDE             Roll/pitch/yaw    10-100 Hz
LOCAL_POSITION_NED   Position          10-50 Hz
GPS_RAW_INT          Raw GPS           1-10 Hz
SET_POSITION_TARGET  Commands          As needed
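
A sketch of sending SET_POSITION_TARGET_LOCAL_NED with pymavlink (assumes the vehicle is already in a guided/offboard-style mode; the type_mask here requests position-only control):

```python
# Command a local NED position setpoint: 10 m north, 5 m up from origin.
from pymavlink import mavutil

link = mavutil.mavlink_connection("udp:127.0.0.1:14550")
link.wait_heartbeat()

link.mav.set_position_target_local_ned_send(
    0,                                    # time_boot_ms (0 lets autopilot stamp it)
    link.target_system, link.target_component,
    mavutil.mavlink.MAV_FRAME_LOCAL_NED,
    0b110111111000,                       # type_mask: ignore velocity/accel/yaw fields
    10.0, 0.0, -5.0,                      # x, y, z in metres (NED: z = -5 is 5 m up)
    0, 0, 0,                              # vx, vy, vz (ignored)
    0, 0, 0,                              # afx, afy, afz (ignored)
    0, 0)                                 # yaw, yaw_rate (ignored)
```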

Kalman Filter Tuning

Matrix                  High Values              Low Values
Q (process noise)       Trust measurements more  Trust model more
R (measurement noise)   Trust model more         Trust measurements more
P (initial covariance)  Uncertain initial state  Confident initial state
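
A scalar Kalman step makes the table concrete (Q and R values are illustrative; the model simply holds the state constant):

```python
# One predict/update cycle; the gain K grows with P and shrinks with R.
def kalman_step(x, p, z, q=0.01, r=1.0):
    p = p + q                    # predict: process noise Q inflates uncertainty
    k = p / (p + r)              # gain: high P or low R -> trust measurement more
    x = x + k * (z - x)          # update toward measurement z
    p = (1 - k) * p
    return x, p

x, p = 0.0, 1.0                      # initial state and covariance P
for z in [0.9, 1.1, 1.0, 0.95]:      # noisy measurements of a value near 1.0
    x, p = kalman_step(x, p, z)
```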

Common Coordinate Frames

Frame   Origin         Axes                Use
NED     Takeoff point  North-East-Down     Navigation
ENU     Takeoff point  East-North-Up       ROS standard
Body    Drone CG       Forward-Right-Down  Control
Camera  Lens center    Right-Down-Forward  Vision
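
NED and ENU differ only by an axis swap and a sign flip, which is worth keeping as a one-liner (sketch):

```python
# Swap north/east and negate the vertical axis.
def ned_to_enu(n, e, d):
    return (e, n, -d)        # ENU: x=east, y=north, z=up

def enu_to_ned(e, n, u):
    return (n, e, -u)        # NED: x=north, y=east, z=down

assert ned_to_enu(*enu_to_ned(1.0, 2.0, 3.0)) == (1.0, 2.0, 3.0)
```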

Reference Files

Detailed implementations in references/:

  • navigation-algorithms.md - SLAM, path planning, localization
  • sensor-fusion-ekf.md - Kalman filters, multi-sensor fusion
  • object-detection-tracking.md - YOLO, ByteTrack, optical flow

Simulation Tools

Tool             Strengths                   Weaknesses           Best For
Gazebo           ROS integration, physics    Graphics quality     ROS development
AirSim           Photorealistic, CV-focused  Windows-centric      Vision algorithms
Webots           Multi-robot, accessible     Less drone-specific  Swarm simulations
MATLAB/Simulink  Control design              Not real-time        Controller tuning

Emerging Technologies (2024-2025)

  • Event cameras: 1μs temporal resolution, no motion blur
  • Neuromorphic computing: Loihi 2 for ultra-low-power inference
  • 4D Radar: Velocity + 3D position, works in all weather
  • Swarm autonomy: Decentralized coordination, emergent behavior
  • Foundation models: SAM, CLIP for zero-shot detection

Integration Points

  • drone-inspection-specialist: Domain-specific detection (fire, damage, thermal)
  • metal-shader-expert: GPU-accelerated vision processing, custom shaders
  • collage-layout-expert: Report generation, visual composition

Key Principle: In drone systems, reliability trumps performance. A 95% accurate system that never crashes is better than a 99% accurate one that fails unpredictably. Always have fallbacks.
