LeRobot Format Specification

Detailed specification of the LeRobot dataset format.

File Structure

Directory Layout

dataset_root/
├── meta/
│   ├── info.json              # Required: Dataset metadata
│   ├── tasks.json             # Optional: Task grouping
│   ├── stats.json             # Optional: Dataset statistics
│   └── episodes.jsonl         # Optional: Episode-level metadata
├── episodes/
│   ├── episode_000000.parquet # Episode 0 data
│   ├── episode_000001.parquet # Episode 1 data
│   └── ...
└── videos/
    ├── observation.image/
    │   ├── episode_000000.mp4 # Video for episode 0
    │   ├── episode_000001.mp4
    │   └── ...
    └── observation.wrist_image/ # Additional camera views
        └── ...
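The layout above can be scaffolded with a few `pathlib` calls. The helper below is a sketch (the name `init_dataset_dirs` is illustrative, not part of the LeRobot API); it only creates the empty directory skeleton, with one `videos/` subdirectory per camera key.

```python
from pathlib import Path

def init_dataset_dirs(root: str, camera_keys=("observation.image",)) -> Path:
    """Create the empty directory skeleton of a LeRobot dataset."""
    root = Path(root)
    (root / "meta").mkdir(parents=True, exist_ok=True)
    (root / "episodes").mkdir(exist_ok=True)
    for key in camera_keys:
        # One video directory per camera stream, named after its column key
        (root / "videos" / key).mkdir(parents=True, exist_ok=True)
    return root
```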

Parquet Schema

Required Columns

{
    'episode_index': 'int64',       # Episode number
    'frame_index': 'int64',         # Frame within episode
    'timestamp': 'float64',         # Timestamp in seconds
    'action': 'list<float32>',      # Action vector
}
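A minimal presence check for this schema can be written without any parquet dependency, by comparing column names against the required set (the `missing_required` helper is an illustrative sketch, not a LeRobot function):

```python
# Required columns and their pyarrow-style dtype strings, as specified above
REQUIRED_COLUMNS = {
    "episode_index": "int64",
    "frame_index": "int64",
    "timestamp": "float64",
    "action": "list<float32>",
}

def missing_required(columns) -> set:
    """Return the names of required columns absent from `columns`."""
    return set(REQUIRED_COLUMNS) - set(columns)
```

In practice you would pass in `table.column_names` from a pyarrow table or `df.columns` from a pandas DataFrame.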

Optional Columns

{
    # Observations
    'observation.state': 'list<float32>',        # Proprioceptive state
    'observation.image': 'string',               # Path to image/video frame
    'observation.wrist_image': 'string',         # Wrist camera
    'observation.depth': 'string',               # Depth image
    'observation.goal': 'list<float32>',         # Goal state

    # Language
    'language_instruction': 'string',            # Task description

    # Episode info
    'next.done': 'bool',                         # Episode end
    'next.success': 'bool',                      # Task success
    'next.reward': 'float32',                    # Reward signal
}
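The `next.done` column marks episode boundaries within a flat frame sequence. As a sketch of how a consumer might use it (assuming `done_flags[i]` is True on the last frame of an episode):

```python
def split_episodes(done_flags):
    """Split a flat frame sequence into episodes using `next.done` flags.

    Returns a list of (start, end) frame-index pairs, end inclusive.
    """
    episodes, start = [], 0
    for i, done in enumerate(done_flags):
        if done:
            episodes.append((start, i))
            start = i + 1  # next episode begins on the following frame
    return episodes
```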

Metadata Files

info.json

Complete specification:

{
  "codebase_version": "2.0",
  "robot_type": "franka",           # Robot platform
  "total_episodes": 1000,
  "total_frames": 50000,
  "fps": 30,                        # Frames per second
  "encoding": {
    "video_codec": "h264",          # Video encoding
    "video_quality": 23,            # CRF value (lower = better)
    "audio": false
  },
  "chunks_size": 1000,              # Frames per chunk (for large datasets)
  "shapes": {
    "observation.image": [3, 224, 224],
    "observation.state": [7],
    "action": [7]
  },
  "names": {
    "observation.state": ["q0", "q1", "q2", "q3", "q4", "q5", "q6"],
    "action": ["dx", "dy", "dz", "droll", "dpitch", "dyaw", "gripper"]
  },
  "stats": {
    "observation.state": {
      "mean": [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
      "std": [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
      "min": [-1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0],
      "max": [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
    },
    "action": {
      "mean": [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5],
      "std": [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.5]
    }
  }
}
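Since `shapes`, `names`, and `stats` all describe the same feature vectors, their lengths must agree. The following sketch cross-checks them (the function name and error format are illustrative, not part of LeRobot):

```python
def check_info_consistency(info: dict) -> list:
    """Cross-check `shapes`, `names`, and `stats` entries of info.json."""
    errors = []
    shapes = info.get("shapes", {})
    # Every named feature must have one label per dimension
    for key, labels in info.get("names", {}).items():
        if key in shapes and shapes[key] != [len(labels)]:
            errors.append(f"{key}: {len(labels)} names but shape {shapes[key]}")
    # Every stats vector must match the feature's last dimension
    for key, stat in info.get("stats", {}).items():
        for field, values in stat.items():
            if key in shapes and len(values) != shapes[key][-1]:
                errors.append(f"{key}.{field}: {len(values)} values, expected {shapes[key][-1]}")
    return errors
```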

tasks.json

{
  "tasks": [
    {
      "task_index": 0,
      "task_name": "pick_cube",
      "episodes": [0, 1, 2, 3, 4, 5],
      "success_rate": 0.95,
      "avg_episode_length": 50
    }
  ]
}
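A loader can use this file to select episodes by task. A minimal lookup sketch (the helper name is illustrative):

```python
def episodes_for_task(tasks_meta: dict, task_name: str) -> list:
    """Return the episode indices recorded for one task in tasks.json."""
    for task in tasks_meta.get("tasks", []):
        if task["task_name"] == task_name:
            return task["episodes"]
    return []  # unknown task
```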

Data Types

Action Representation

# End-effector delta actions (recommended)
action = {
    'delta_pos': [dx, dy, dz],          # Positional change
    'delta_rot': [droll, dpitch, dyaw], # Rotational change
    'gripper': gripper_state            # Gripper (0=open, 1=closed)
}

# Joint space actions
action = {
    'joint_positions': [q0, q1, q2, q3, q4, q5, q6],
    'gripper': gripper_state
}

# Stored as flattened array
action_array = np.concatenate([delta_pos, delta_rot, [gripper]])
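Because only the flat array is stored, consumers need to re-slice it by convention. A round-trip sketch for the 7-D delta-EE layout above (the helper names are illustrative):

```python
import numpy as np

def flatten_action(delta_pos, delta_rot, gripper):
    """Pack a delta-EE action into the flat 7-D float32 array stored in parquet."""
    return np.concatenate([delta_pos, delta_rot, [gripper]]).astype(np.float32)

def unflatten_action(action_array):
    """Recover the structured action from a flat 7-D array."""
    return {
        "delta_pos": action_array[:3],   # dx, dy, dz
        "delta_rot": action_array[3:6],  # droll, dpitch, dyaw
        "gripper": float(action_array[6]),
    }
```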

Image Storage

Images are stored either as per-episode video files (frames addressed by a fragment suffix) or as individual image files:

# Path format in parquet
'observation.image': 'videos/observation.image/episode_000000.mp4#frame=42'

# Or as individual frames
'observation.image': 'videos/observation.image/episode_000000_frame_000042.png'
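Both path styles can be handled by splitting on the `#frame=` fragment; a parsing sketch (the helper name is illustrative):

```python
def parse_frame_ref(path: str):
    """Split an image path into (file_path, frame_index).

    frame_index is None when the path points at a standalone image file.
    """
    if "#frame=" in path:
        video_path, frame = path.split("#frame=", 1)
        return video_path, int(frame)
    return path, None
```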

Normalization

Action Normalization

import numpy as np

# Z-score normalize, then clip outliers to [-1, 1]
def normalize_action(action, stats):
    action_normalized = (action - stats['mean']) / stats['std']
    action_normalized = np.clip(action_normalized, -1, 1)
    return action_normalized

# Denormalize (an exact inverse only for values that were not clipped)
def denormalize_action(action_normalized, stats):
    action = action_normalized * stats['std'] + stats['mean']
    return action

State Normalization

# Z-score normalization (small epsilon guards against zero std
# in dimensions that are constant across the dataset)
def normalize_state(state, stats):
    return (state - stats['mean']) / (stats['std'] + 1e-8)

Validation

Dataset Validation

from lerobot.common.datasets.utils import validate_dataset

# Validate dataset structure
errors = validate_dataset('data/my_dataset')

if errors:
    print("Validation errors:")
    for error in errors:
        print(f"  - {error}")
else:
    print("Dataset is valid!")

Common Issues

Issue              Description                    Fix
Missing metadata   info.json incomplete           Add required fields
Shape mismatch     Array sizes don't match spec   Ensure consistent shapes
Invalid paths      Video paths don't exist        Check file paths
Inconsistent FPS   FPS varies across episodes     Resample to a consistent FPS
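The FPS-consistency issue can be caught by comparing the rate implied by an episode's timestamps against the `fps` declared in info.json. A sketch (the helper name and tolerance are illustrative):

```python
def check_fps(timestamps, expected_fps, tol=0.05):
    """Return True if the average frame rate implied by `timestamps`
    is within `tol` relative error of the declared fps."""
    if len(timestamps) < 2:
        return True  # too short to measure
    measured = (len(timestamps) - 1) / (timestamps[-1] - timestamps[0])
    return abs(measured - expected_fps) / expected_fps <= tol
```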

Next Steps