
LingBot-Map 3D Reconstruction Skill

Skill by ara.so — Daily 2026 Skills collection.

LingBot-Map is a feed-forward 3D foundation model that reconstructs scenes from streaming image or video data using a Geometric Context Transformer. Paged KV cache attention lets it sustain ~20 FPS at 518×378 resolution over sequences exceeding 10,000 frames.

What It Does

  • Streaming 3D reconstruction from image sequences or video
  • Feed-forward inference (no iterative optimization needed)
  • Outputs: point clouds with per-point confidence, camera poses, depth maps
  • Key features: anchor context, pose-reference window, trajectory memory for drift correction

Installation

# 1. Create environment
conda create -n lingbot-map python=3.10 -y
conda activate lingbot-map

# 2. Install PyTorch (CUDA 12.8)
pip install torch==2.9.1 torchvision==0.24.1 --index-url https://download.pytorch.org/whl/cu128

# 3. Install lingbot-map
git clone https://github.com/Robbyant/lingbot-map.git
cd lingbot-map
pip install -e .

# 4. Install FlashInfer for fast paged KV cache attention (recommended)
pip install flashinfer-python -i https://flashinfer.ai/whl/cu128/torch2.9/

# 5. Optional: visualization support
pip install -e ".[vis]"

# 6. Optional: sky masking for outdoor scenes
pip install onnxruntime       # CPU
pip install onnxruntime-gpu   # GPU

Model Download

Models available on HuggingFace and ModelScope:

# Download via huggingface_hub
from huggingface_hub import hf_hub_download

model_path = hf_hub_download(
    repo_id="robbyant/lingbot-map",
    filename="checkpoint.pt"
)

Or manually download from:

  • HuggingFace: https://huggingface.co/robbyant/lingbot-map
  • ModelScope: https://www.modelscope.cn/models/Robbyant/lingbot-map

CLI Commands

Demo with Interactive 3D Viewer (browser at localhost:8080)

# From image folder
python demo.py --model_path /path/to/checkpoint.pt \
    --image_folder /path/to/images/

# From video file
python demo.py --model_path /path/to/checkpoint.pt \
    --video_path video.mp4 --fps 10

# Outdoor scene with sky masking
python demo.py --model_path /path/to/checkpoint.pt \
    --image_folder /path/to/images/ --mask_sky

# Example scenes included in repo
python demo.py --model_path /path/to/checkpoint.pt \
    --image_folder example/church --mask_sky

python demo.py --model_path /path/to/checkpoint.pt \
    --image_folder example/oxford --mask_sky

python demo.py --model_path /path/to/checkpoint.pt \
    --image_folder example/university4 --mask_sky

Long Sequence Handling

# Keyframe interval: store every Nth frame in KV cache (saves memory)
# Use when sequence > 320 frames
python demo.py --model_path /path/to/checkpoint.pt \
    --image_folder /path/to/images/ --keyframe_interval 6

# Windowed mode: for very long sequences (>3000 frames)
python demo.py --model_path /path/to/checkpoint.pt \
    --video_path video.mp4 --fps 10 \
    --mode windowed --window_size 64

Without FlashInfer (SDPA fallback)

python demo.py --model_path /path/to/checkpoint.pt \
    --image_folder /path/to/images/ --use_sdpa

Sky Masking with Custom Paths

python demo.py --model_path /path/to/checkpoint.pt \
    --image_folder /path/to/images/ --mask_sky \
    --sky_mask_dir /path/to/cached_masks/ \
    --sky_mask_visualization_dir /path/to/mask_viz/

CLI Arguments Reference

Input

Argument          Description
--model_path      Path to model checkpoint (.pt file)
--image_folder    Directory of input images
--video_path      Input video file path
--fps             Frames per second to sample from video

Inference Mode

Argument             Default    Description
--mode               streaming  streaming or windowed
--window_size        64         Window size for windowed mode
--keyframe_interval  1          Store every Nth frame in KV cache
--use_sdpa           False      Use PyTorch SDPA instead of FlashInfer

Sky Masking

Argument                      Description
--mask_sky                    Enable sky segmentation and masking
--sky_mask_dir                Custom directory for cached sky masks
--sky_mask_visualization_dir  Save side-by-side mask visualizations

Visualization

Argument             Default  Description
--port               8080     Viser viewer port
--conf_threshold     1.5      Filter low-confidence points
--point_size         0.00001  Point cloud point size
--downsample_factor  10       Spatial downsampling for display
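
The --conf_threshold and --downsample_factor flags correspond to a simple filtering step before display. A minimal NumPy sketch of that logic, assuming the model emits a per-pixel pointmap of shape (H, W, 3) and a confidence map of shape (H, W) (the shapes and function name here are illustrative, not the actual API):

```python
import numpy as np

def filter_for_display(pointmap, confidence, conf_threshold=1.5, downsample_factor=10):
    """Spatially downsample, then drop points below the confidence threshold.

    pointmap:   (H, W, 3) per-pixel 3D points (assumed output layout)
    confidence: (H, W) per-point confidence scores
    Returns an (N, 3) array of surviving points.
    """
    pts = pointmap[::downsample_factor, ::downsample_factor]
    conf = confidence[::downsample_factor, ::downsample_factor]
    return pts[conf > conf_threshold]

# Synthetic example at the model's native 518x378 resolution
pointmap = np.random.rand(378, 518, 3).astype(np.float32)
confidence = np.full((378, 518), 2.0, dtype=np.float32)
points = filter_for_display(pointmap, confidence)
```

Raising conf_threshold trades density for cleanliness; lowering downsample_factor shows more points at the cost of viewer performance.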

Python API Usage

Basic Streaming Inference

import torch
from lingbot_map import LingBotMap  # adjust import to actual module structure

# Load model
device = "cuda" if torch.cuda.is_available() else "cpu"
model = LingBotMap.from_pretrained("/path/to/checkpoint.pt")
model = model.to(device).eval()

# Streaming inference over image list
from pathlib import Path
from PIL import Image
import torchvision.transforms as T

transform = T.Compose([
    T.Resize((378, 518)),  # (height, width): the model's native 518×378 resolution
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406],
                std=[0.229, 0.224, 0.225])
])

image_paths = sorted(Path("/path/to/images").glob("*.jpg"))

with torch.no_grad():
    for img_path in image_paths:
        img = Image.open(img_path).convert("RGB")
        frame = transform(img).unsqueeze(0).to(device)
        output = model.stream(frame)
        # output contains: pointmap, confidence, camera pose
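
Accumulated per-frame points can be dumped to a standard PLY file for inspection in external viewers. A small helper, assuming you have collected the streamed outputs into an (N, 3) NumPy array (the output key names above are placeholders, so the collection step is left to the reader):

```python
import numpy as np

def write_ply(path, points):
    """Write an (N, 3) array as an ASCII PLY point cloud."""
    points = np.asarray(points, dtype=np.float32)
    with open(path, "w") as f:
        f.write("ply\nformat ascii 1.0\n")
        f.write(f"element vertex {len(points)}\n")
        f.write("property float x\nproperty float y\nproperty float z\n")
        f.write("end_header\n")
        for x, y, z in points:
            f.write(f"{x} {y} {z}\n")

# e.g. after stacking per-frame pointmaps into `all_points`:
all_points = np.random.rand(100, 3)
write_ply("scene.ply", all_points)
```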

Loading and Running the Demo Programmatically

# The demo.py script is the primary entry point
# Run it as a subprocess or study it for API patterns
import subprocess

result = subprocess.run([
    "python", "demo.py",
    "--model_path", "/path/to/checkpoint.pt",
    "--image_folder", "example/church",
    "--mask_sky",
    "--port", "8080"
], check=True)

Video Input Pattern

import cv2
import torch

# Extract frames from video for batch processing
def extract_frames(video_path: str, fps: int = 10):
    cap = cv2.VideoCapture(video_path)
    video_fps = cap.get(cv2.CAP_PROP_FPS)
    interval = max(1, int(video_fps / fps))
    
    frames = []
    frame_idx = 0
    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break
        if frame_idx % interval == 0:
            # Convert BGR to RGB
            frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
            frames.append(frame_rgb)
        frame_idx += 1
    
    cap.release()
    return frames

frames = extract_frames("video.mp4", fps=10)

Common Patterns

Pattern 1: Outdoor Scene Reconstruction

# Always use --mask_sky for outdoor scenes to remove noisy sky points
python demo.py \
    --model_path ./checkpoint.pt \
    --image_folder ./outdoor_images \
    --mask_sky \
    --conf_threshold 2.0 \
    --downsample_factor 5

Pattern 2: Long Indoor Sequence

# Use keyframe_interval to manage KV cache for sequences 320-3000 frames
python demo.py \
    --model_path ./checkpoint.pt \
    --image_folder ./long_sequence \
    --keyframe_interval 6 \
    --conf_threshold 1.5

Pattern 3: Very Long Video (>3000 frames)

# Use windowed mode for extremely long sequences
python demo.py \
    --model_path ./checkpoint.pt \
    --video_path long_video.mp4 \
    --fps 5 \
    --mode windowed \
    --window_size 64

Pattern 4: High Quality Dense Reconstruction

# Lower conf_threshold keeps more points, smaller downsample shows more detail
python demo.py \
    --model_path ./checkpoint.pt \
    --image_folder ./images \
    --conf_threshold 1.0 \
    --downsample_factor 1 \
    --point_size 0.00005

Pattern 5: CPU / No FlashInfer Fallback

# When FlashInfer is unavailable, use SDPA
python demo.py \
    --model_path ./checkpoint.pt \
    --image_folder ./images \
    --use_sdpa

Architecture Concepts

Component              Role
Anchor Context         Coordinate grounding to prevent drift
Pose-Reference Window  Dense geometric cues from recent frames
Trajectory Memory      Long-range drift correction across the sequence
Paged KV Cache         Efficient attention over long streaming sequences
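
The paged KV cache can be pictured as fixed-size pages allocated on demand, so memory grows in page-sized steps rather than requiring one contiguous buffer sized for the whole sequence up front. A toy sketch of the idea (not the actual implementation, which lives in FlashInfer's CUDA kernels):

```python
class PagedKVCache:
    """Toy model of paged attention storage: entries live in fixed-size pages."""

    def __init__(self, page_size=16):
        self.page_size = page_size
        self.pages = []  # each page holds up to page_size KV entries

    def append(self, kv_entry):
        if not self.pages or len(self.pages[-1]) == self.page_size:
            self.pages.append([])  # allocate a new page on demand
        self.pages[-1].append(kv_entry)

    def __len__(self):
        return sum(len(p) for p in self.pages)

cache = PagedKVCache(page_size=16)
for i in range(40):
    cache.append(i)
# 40 entries occupy 3 pages (16 + 16 + 8)
```

Because pages are small and uniform, a long stream never needs reallocation or copying as it grows, which is what makes 10,000+ frame sequences tractable.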

Troubleshooting

FlashInfer Not Available

# Error: FlashInfer not found
# Solution: Install or use SDPA fallback
pip install flashinfer-python -i https://flashinfer.ai/whl/cu128/torch2.9/
# Or add --use_sdpa to any command
python demo.py --model_path ./checkpoint.pt --image_folder ./imgs --use_sdpa

CUDA Out of Memory on Long Sequences

# Reduce memory with keyframe interval
python demo.py --model_path ./checkpoint.pt \
    --image_folder ./images --keyframe_interval 6

# Or switch to windowed mode
python demo.py --model_path ./checkpoint.pt \
    --image_folder ./images --mode windowed --window_size 32

Sky Mask Model Download Fails

# Manual download of skyseg.onnx
wget https://huggingface.co/JianyuanWang/skyseg/resolve/main/skyseg.onnx
# Place in expected path or specify via --sky_mask_dir

Low Quality / Noisy Point Cloud

# Increase confidence threshold to filter noisy points
python demo.py --model_path ./checkpoint.pt \
    --image_folder ./images --conf_threshold 2.5

# For outdoor, always add sky masking
python demo.py --model_path ./checkpoint.pt \
    --image_folder ./images --mask_sky --conf_threshold 2.0

Port Already in Use

# Change the viewer port
python demo.py --model_path ./checkpoint.pt \
    --image_folder ./images --port 8090

Images Not Loading

# Ensure images are sorted and in supported formats (jpg, png)
ls /path/to/images | head -5
# Supported: .jpg, .jpeg, .png, .bmp, .webp
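
To preview what the loader will likely pick up, a small helper that collects the supported extensions in sorted order (the extension list comes from above; lexicographic ordering is an assumption about the loader's behavior):

```python
from pathlib import Path

SUPPORTED = {".jpg", ".jpeg", ".png", ".bmp", ".webp"}

def list_images(folder):
    """Return supported image files in lexicographic (frame) order."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.suffix.lower() in SUPPORTED
    )
```

If frames are numbered without zero-padding (1.jpg, 10.jpg, 2.jpg), rename them before running, since lexicographic order will scramble the sequence.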

Performance Guidelines

Sequence Length   Recommended Mode                  Notes
< 320 frames      Default streaming                 Full KV cache
320–3000 frames   --keyframe_interval 6             Reduces cache by 6x
> 3000 frames     --mode windowed --window_size 64  Sliding window

  • Target resolution: 518×378 for ~20 FPS throughput
  • GPU: CUDA-capable GPU required for practical speeds
  • Model size: ~4.63 GB checkpoint
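
The guidance above reduces to simple arithmetic on cache residency. A sketch of the assumed rules (the function and its behavior are illustrative, not part of the CLI):

```python
def cached_frames(total_frames, keyframe_interval=1, mode="streaming", window_size=64):
    """Rough count of frames resident in the KV cache under each mode (assumed behavior)."""
    if mode == "windowed":
        return min(total_frames, window_size)
    # streaming: every keyframe_interval-th frame is kept (ceiling division)
    return -(-total_frames // keyframe_interval)

# A 3000-frame sequence with --keyframe_interval 6 keeps ~500 frames cached;
# a 10,000-frame video in windowed mode keeps only the 64-frame window.
```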

Citation

@article{chen2026geometric,
  title={Geometric Context Transformer for Streaming 3D Reconstruction},
  author={Chen, Lin-Zhuo and Gao, Jian and Chen, Yihang and Cheng, Ka Leong and Sun, Yipengjing and Hu, Liangxiao and Xue, Nan and Zhu, Xing and Shen, Yujun and Yao, Yao and Xu, Yinghao},
  journal={arXiv preprint arXiv:2604.14141},
  year={2026}
}