Visual SLAM

Simultaneous Localization and Mapping (SLAM) provides accurate position estimation in GPS-denied underwater environments.

Overview

AQUA Stack's Visual SLAM system enables underwater vehicles to:

  • Localize accurately without GPS
  • Map the environment in 3D
  • Detect loop closures to correct drift
  • Fuse multiple sensors for robustness

Architecture

graph LR
    A[Camera] --> B[Feature Detection]
    B --> C[Tracking]
    C --> D[Visual Odometry]
    D --> E[Local Mapping]
    E --> F[Loop Closure]
    F --> G[Pose Graph Opt]
    G --> H[Global Map]

    I[IMU] --> D
    J[Depth] --> D

    style E fill:#4CAF50
    style F fill:#FF9800
    style G fill:#2196F3

SLAM Pipeline

1. Feature Detection

Extract distinctive visual features from images:

  • ORB (Oriented FAST and Rotated BRIEF): Fast, rotation invariant
  • SIFT: Scale and rotation invariant, slower
  • SuperPoint: Deep learning-based, most robust

For example:

# Configure feature detector
feature_config = {
    'type': 'ORB',
    'num_features': 1000,
    'scale_factor': 1.2,
    'num_levels': 8
}

2. Visual Odometry

Estimate motion between frames:

def visual_odometry(frame_current, frame_previous):
    # Match features
    matches = match_features(frame_current.features, 
                             frame_previous.features)

    # Estimate motion (Essential matrix)
    E, inliers = estimate_essential_matrix(matches)

    # Recover pose
    R, t = recover_pose(E, inliers)

    return Transform(R, t)
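
The geometry behind estimate_essential_matrix can be checked numerically: for relative motion X2 = R·X1 + t, the essential matrix is E = [t]×R, and every true correspondence satisfies the epipolar constraint x2ᵀ·E·x1 = 0. A minimal numpy sketch with synthetic data (the helper names are illustrative, not AQUA Stack API):

```python
import numpy as np

def skew(t):
    """Skew-symmetric matrix so that skew(t) @ v == np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

# Synthetic relative motion: small yaw plus a sideways translation
theta = 0.1
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0, 0.0, 1.0]])
t = np.array([1.0, 0.0, 0.0])

E = skew(t) @ R  # essential matrix for motion X2 = R @ X1 + t

# Project one 3D point into both views (normalized image coordinates)
X1 = np.array([0.5, -0.2, 4.0])
X2 = R @ X1 + t
x1 = X1 / X1[2]
x2 = X2 / X2[2]

residual = x2 @ E @ x1  # epipolar constraint, ~0 for the true E
```

In practice the essential matrix is estimated from noisy matches inside a RANSAC loop (hence the inliers above), but the same constraint is what the estimator minimizes.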

3. Local Mapping

Create local 3D map:

  • Track map points across frames
  • Triangulate new 3D points
  • Optimize local pose graph
  • Cull outliers and redundant keyframes
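
The "triangulate new 3D points" step can be sketched with the standard linear (DLT) method. A self-contained numpy example with synthetic cameras; the function name and setup are illustrative, not AQUA Stack API:

```python
import numpy as np

def triangulate_dlt(P1, P2, x1, x2):
    """Triangulate one 3D point from two 3x4 projection matrices and
    normalized image observations x1, x2 (each (u, v))."""
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The homogeneous point is the null vector of A:
    # the right singular vector for the smallest singular value
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]  # dehomogenize

# Two cameras: identity pose, and a 1 m baseline along x
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])

X_true = np.array([0.3, -0.1, 5.0])
x1 = X_true[:2] / X_true[2]
x2 = (X_true[:2] + np.array([-1.0, 0.0])) / X_true[2]

X_est = triangulate_dlt(P1, P2, x1, x2)  # recovers X_true
```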

4. Loop Closure Detection

Detect when vehicle returns to previously visited areas:

def detect_loop_closure(current_frame, map_database):
    # Compute visual similarity
    candidates = map_database.query_similar(current_frame)

    for candidate in candidates:
        # Geometric verification
        if verify_geometric_consistency(current_frame, candidate):
            return LoopClosure(current_frame, candidate)

    return None
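
The query_similar lookup is typically a bag-of-visual-words search. Below is a toy numpy sketch of cosine-similarity scoring over BoW histograms, reusing the min_score threshold from the configuration section; the database layout is an assumption, not the actual AQUA Stack structure:

```python
import numpy as np

def bow_similarity(h1, h2):
    """Cosine similarity between two bag-of-words histograms."""
    h1 = h1 / (np.linalg.norm(h1) + 1e-12)
    h2 = h2 / (np.linalg.norm(h2) + 1e-12)
    return float(h1 @ h2)

def query_similar(query_hist, database, min_score=0.7):
    """Return (frame_id, score) candidates above min_score, best first."""
    scored = [(fid, bow_similarity(query_hist, h))
              for fid, h in database.items()]
    return sorted([(fid, s) for fid, s in scored if s >= min_score],
                  key=lambda pair: -pair[1])

# Toy database: frame 2's histogram nearly matches the query, frame 1's does not
database = {
    1: np.array([5.0, 0.0, 0.0, 1.0]),
    2: np.array([1.0, 3.0, 2.0, 0.0]),
}
query = np.array([1.0, 3.0, 1.5, 0.0])

candidates = query_similar(query, database)  # only frame 2 survives the gate
```

Candidates that pass the similarity gate then go through geometric verification, as in the pseudocode above, before being accepted as loop closures.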

5. Pose Graph Optimization

Correct accumulated drift using loop closures:

def optimize_pose_graph(keyframes, loop_closures):
    # Build pose graph
    graph = PoseGraph()

    # Add odometry edges
    for i in range(len(keyframes) - 1):
        graph.add_edge(keyframes[i], keyframes[i+1], 
                       odometry_constraint)

    # Add loop closure edges
    for loop in loop_closures:
        graph.add_edge(loop.frame1, loop.frame2, 
                       loop_constraint)

    # Optimize
    optimized_poses = graph.optimize()

    return optimized_poses
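
On a toy 1D trajectory, pose graph optimization reduces to linear least squares, which makes the drift-correction effect easy to see. This sketch is purely illustrative, not the AQUA Stack solver:

```python
import numpy as np

def optimize_1d_pose_graph(edges, n_poses):
    """Least-squares pose graph on a line; pose 0 is fixed at the origin.
    Each edge is (i, j, z), meaning 'measured x_j - x_i = z'."""
    A = np.zeros((len(edges) + 1, n_poses))
    b = np.zeros(len(edges) + 1)
    for row, (i, j, z) in enumerate(edges):
        A[row, i], A[row, j] = -1.0, 1.0   # residual: (x_j - x_i) - z
        b[row] = z
    A[-1, 0] = 1.0                          # gauge constraint: x_0 = 0
    poses, *_ = np.linalg.lstsq(A, b, rcond=None)
    return poses

# Odometry says +1 m per step, but a loop closure reports that pose 3
# coincides with pose 0, so 3 m of drift must be redistributed
odometry = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0)]
loops = [(3, 0, 0.0)]

poses = optimize_1d_pose_graph(odometry + loops, n_poses=4)
# drift is spread evenly over the cycle: poses ~ [0, 0.25, 0.5, 0.75]
```

Real SLAM backends solve the same kind of problem over SE(3) poses with nonlinear iterative solvers, but the principle, distributing loop-closure error across odometry edges, is identical.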

Sensor Fusion

Combine visual SLAM with other sensors:

IMU Integration

Pre-integrate IMU measurements between frames:

def imu_preintegration(imu_measurements, dt):
    # Initialize relative-motion deltas
    delta_p = np.zeros(3)
    delta_v = np.zeros(3)
    delta_R = np.eye(3)

    for imu in imu_measurements:
        # Rotate body-frame acceleration into the pre-integration frame
        accel = delta_R @ imu.accel

        # Update position (second-order term uses this sample's accel)
        delta_p += delta_v * dt + 0.5 * accel * dt**2

        # Update velocity
        delta_v += accel * dt

        # Update rotation via the SO(3) exponential map
        delta_R = delta_R @ exp_map(imu.gyro * dt)

    return (delta_p, delta_v, delta_R)
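
The exp_map helper referenced above is the SO(3) exponential map (rotation vector to rotation matrix); a standard numpy implementation via Rodrigues' formula:

```python
import numpy as np

def exp_map(omega):
    """SO(3) exponential: rotation vector (axis * angle) -> 3x3 rotation."""
    theta = np.linalg.norm(omega)
    if theta < 1e-10:
        return np.eye(3)  # first-order limit: identity for tiny rotations
    axis = omega / theta
    K = np.array([[0.0, -axis[2], axis[1]],
                  [axis[2], 0.0, -axis[0]],
                  [-axis[1], axis[0], 0.0]])
    # Rodrigues' formula: R = I + sin(theta) K + (1 - cos(theta)) K^2
    return np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * (K @ K)

# Sanity check: a 90-degree rotation about z maps the x-axis to the y-axis
R = exp_map(np.array([0.0, 0.0, np.pi / 2]))
```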

Depth Sensor Fusion

Use depth sensor to resolve scale ambiguity:

def fuse_depth(map_points, trajectory, visual_scale, depth_measurement):
    # Monocular visual SLAM is scale-ambiguous; the depth sensor
    # provides an absolute reference to recover metric scale
    scale_factor = depth_measurement / visual_scale

    # Scale all map points and the trajectory consistently
    scaled_map = apply_scale(map_points, scale_factor)
    scaled_trajectory = apply_scale(trajectory, scale_factor)

    return scaled_map, scaled_trajectory

DVL Integration (Optional)

Doppler Velocity Log provides velocity measurements:

  • Reduces drift significantly (to <0.5% of distance traveled)
  • Especially important for long missions
  • Expensive but worthwhile for commercial operations

Performance

Typical SLAM performance:

Metric                 | Value    | Conditions
-----------------------|----------|------------------------
Drift rate             | 0.5-2%   | Visual-inertial
Drift rate (with DVL)  | <0.5%    | All sensors
Keyframe rate          | 1-5 Hz   | Depends on motion
Loop closure detection | >90%     | Good revisits
Map points             | 10K-100K | Depends on environment

Configuration

Key SLAM parameters:

slam:
  frontend:
    feature_detector: ORB
    num_features: 1000
    min_features: 100

  tracking:
    ransac_threshold: 3.0
    min_inliers: 20

  mapping:
    keyframe_distance: 1.0
    keyframe_angle: 15.0
    culling_redundancy: 0.9

  loop_closure:
    enabled: true
    min_score: 0.7
    geometric_check: true

  optimization:
    local_window: 20
    global_every_n_keyframes: 10

Best Practices

DO:

  • Ensure good lighting
  • Move slowly during initialization
  • Add visual texture if needed
  • Monitor feature count
  • Enable loop closure

DON'T:

  • Move too fast (>1 m/s without DVL)
  • Operate in featureless environments
  • Ignore SLAM warnings
  • Disable sensor fusion

Troubleshooting

Common SLAM issues: