
LeRobot Dataset Format

The LeRobot dataset format is a standardized format for robotics datasets that enables easy sharing, reproduction, and benchmarking across different platforms.

Overview

LeRobot provides:

  • Unified data structure for multi-modal robotics data
  • Efficient storage using Parquet format
  • Easy loading with Python API
  • Hugging Face integration for dataset sharing
  • Standardized metadata for reproducibility

Dataset Structure

dataset_name/
├── meta/
│   ├── info.json           # Dataset metadata
│   ├── tasks.json          # Task descriptions
│   └── stats.json          # Statistics
├── episodes/
│   ├── episode_000000.parquet
│   ├── episode_000001.parquet
│   └── ...
└── videos/
    ├── observation.image/
    │   ├── episode_000000.mp4
    │   └── ...
    └── observation.wrist_image/
        └── ...
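
Because each episode is a plain Parquet file, the raw data can be inspected without the LeRobot API at all; for example, with pandas (the path below assumes the layout shown above):

import pandas as pd

# Inspect one raw episode file directly
df = pd.read_parquet("dataset_name/episodes/episode_000000.parquet")

print(df.columns.tolist())    # e.g. ['observation.state', 'action', 'timestamp', ...]
print(len(df))                # number of frames in the episode
print(df["timestamp"].head())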

Quick Start

Installation

pip install lerobot

Loading a Dataset

from lerobot.common.datasets.lerobot_dataset import LeRobotDataset

# Load from Hugging Face Hub
dataset = LeRobotDataset(
    repo_id="lerobot/pusht",
    root="data/"
)

# Access a single frame (indexing is per frame, not per episode)
frame = dataset[0]
print(frame.keys())
# dict_keys(['observation.image', 'observation.state', 'action', 'episode_index', 'frame_index', 'timestamp'])

# Get specific fields
image = frame['observation.image']   # torch.Tensor
state = frame['observation.state']   # torch.Tensor
action = frame['action']             # torch.Tensor
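
LeRobotDataset can also return a short temporal window around each frame via the delta_timestamps argument, which maps each key to a list of offsets in seconds. A minimal sketch (lerobot/pusht is recorded at 10 fps, so offsets are multiples of 0.1 s):

from lerobot.common.datasets.lerobot_dataset import LeRobotDataset

# Request the previous and current image, plus a short action chunk;
# offsets are in seconds and should be multiples of 1/fps.
delta_timestamps = {
    "observation.image": [-0.1, 0.0],   # previous and current frame
    "action": [0.0, 0.1, 0.2, 0.3],     # current and next 3 actions
}

dataset = LeRobotDataset("lerobot/pusht", delta_timestamps=delta_timestamps)

frame = dataset[0]
print(frame["observation.image"].shape)  # extra leading time dimension: (2, C, H, W)
print(frame["action"].shape)             # (4, action_dim)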

Creating a Dataset

from lerobot.common.datasets.lerobot_dataset import LeRobotDataset
import time

# Initialize dataset
dataset = LeRobotDataset.create(
    repo_id="username/my_dataset",
    root="data/my_dataset",
    robot_type="franka",
    fps=30
)

# Add episode
episode_data = {
    'observation.image': [],       # List of images
    'observation.state': [],       # List of robot states
    'action': [],                  # List of actions
    'timestamp': []                # List of timestamps
}

# camera, robot, and num_steps are stand-ins for your own hardware interfaces
for step in range(num_steps):
    episode_data['observation.image'].append(camera.get_image())
    episode_data['observation.state'].append(robot.get_state())
    episode_data['action'].append(robot.get_action())
    episode_data['timestamp'].append(time.time())

dataset.add_episode(episode_data)

# Save
dataset.save()

# Push to Hugging Face Hub
dataset.push_to_hub()

Data Format Specification

Episode Structure

Each episode is stored as a Parquet file with columns:

Column           Type      Description
episode_index    int       Episode number
frame_index      int       Frame number within the episode
timestamp        float     Timestamp in seconds
observation.*    varies    Observations (images, states, etc.)
action           float[]   Action taken
next.done        bool      Episode termination flag
next.success     bool      Task success flag (optional)
next.reward      float     Reward (optional, for RL)
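
To check that a file matches this schema without loading the data, the Parquet metadata can be read directly; a small sketch with pyarrow (the path is assumed from the layout above):

import pyarrow.parquet as pq

# Read only the schema and row count, not the column data
schema = pq.read_schema("dataset_name/episodes/episode_000000.parquet")
meta = pq.read_metadata("dataset_name/episodes/episode_000000.parquet")

print(schema)          # column names and types
print(meta.num_rows)   # frames in this episode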

Observation Types

# Image observations (stored on disk as references into MP4 videos,
# decoded to torch.Tensor frames when loaded)
'observation.image': str  # Path to frame in video
'observation.wrist_image': str

# State observations (stored as arrays)
'observation.state': float[]  # Robot joint positions
'observation.velocity': float[]  # Joint velocities

# Task information
'observation.goal': float[]  # Goal state/position
'instruction': str  # Natural language instruction

Action Format

import numpy as np

# Continuous actions (x, y, z, roll, pitch, yaw, gripper are your raw commands)
action = np.array([x, y, z, roll, pitch, yaw, gripper])

# Normalized to [-1, 1] using per-dimension bounds
action = (action - action_min) / (action_max - action_min) * 2 - 1
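
At inference time, model output in [-1, 1] has to be mapped back to the raw action range. A round-trip sketch of both directions (action_min and action_max are the same per-dimension bounds used above, e.g. taken from stats.json):

import numpy as np

def normalize(action, action_min, action_max):
    # Map raw actions into [-1, 1], dimension-wise
    return (action - action_min) / (action_max - action_min) * 2 - 1

def denormalize(action_norm, action_min, action_max):
    # Inverse of normalize: map [-1, 1] back to the raw range
    return (action_norm + 1) / 2 * (action_max - action_min) + action_min

a_min = np.array([-0.5, -0.5, 0.0])
a_max = np.array([0.5, 0.5, 1.0])
raw = np.array([0.25, -0.1, 0.4])
assert np.allclose(denormalize(normalize(raw, a_min, a_max), a_min, a_max), raw)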

Metadata

info.json

{
  "fps": 30,
  "robot_type": "franka",
  "total_episodes": 1000,
  "total_frames": 50000,
  "video_codec": "h264",
  "shapes": {
    "observation.image": [3, 224, 224],
    "observation.state": [7],
    "action": [7]
  },
  "names": {
    "observation.state": ["q0", "q1", "q2", "q3", "q4", "q5", "q6"],
    "action": ["x", "y", "z", "roll", "pitch", "yaw", "gripper"]
  }
}
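
Since the metadata is plain JSON, it can be read with the standard library; for example:

import json

with open("dataset_name/meta/info.json") as f:
    info = json.load(f)

print(info["fps"])                          # 30
print(info["shapes"]["observation.state"])  # [7]
print(info["names"]["action"])              # ['x', 'y', 'z', ...]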

tasks.json

{
  "tasks": [
    {
      "task_index": 0,
      "task_name": "pick_red_cube",
      "episodes": [0, 1, 2, 10, 11, 12]
    },
    {
      "task_index": 1,
      "task_name": "pick_blue_cube",
      "episodes": [3, 4, 5, 13, 14, 15]
    }
  ]
}
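
This index makes per-task episode selection a dictionary lookup; for example:

import json

with open("dataset_name/meta/tasks.json") as f:
    tasks = json.load(f)["tasks"]

# Build a task-name -> episode-indices lookup
episodes_by_task = {t["task_name"]: t["episodes"] for t in tasks}
print(episodes_by_task["pick_red_cube"])  # [0, 1, 2, 10, 11, 12]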

Using with PyTorch

DataLoader Integration

from torch.utils.data import DataLoader
from lerobot.common.datasets.lerobot_dataset import LeRobotDataset

dataset = LeRobotDataset("lerobot/pusht")

# Create DataLoader (frames are fixed-size tensors, so the default
# collate function works)
dataloader = DataLoader(
    dataset,
    batch_size=32,
    shuffle=True,
    num_workers=4
)

# Training loop (model, criterion, and optimizer are assumed to be defined)
for batch in dataloader:
    images = batch['observation.image']
    states = batch['observation.state']
    actions = batch['action']

    # Train model
    predicted_actions = model(images, states)
    loss = criterion(predicted_actions, actions)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
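
On a GPU machine, each batch must also be moved to the device inside the loop; a short sketch, assuming the DataLoader above:

import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

for batch in dataloader:
    # Move every tensor in the batch dict to the training device
    batch = {k: (v.to(device) if isinstance(v, torch.Tensor) else v)
             for k, v in batch.items()}
    # ... forward/backward pass as above ...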

Custom Transforms

from torchvision import transforms

# LeRobot decodes video frames to torch.Tensor images, so ToTensor is not
# needed; these torchvision transforms all operate on tensors directly.
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.RandomCrop((200, 200)),
    transforms.ColorJitter(brightness=0.2),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

dataset = LeRobotDataset(
    "lerobot/pusht",
    image_transforms=transform
)
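
Any callable that maps an image tensor to an image tensor works as well; a minimal sketch of a custom augmentation (RandomGaussianNoise is illustrative, not part of LeRobot):

import torch
from lerobot.common.datasets.lerobot_dataset import LeRobotDataset

class RandomGaussianNoise:
    """Add small Gaussian noise to a float image tensor (simple augmentation)."""

    def __init__(self, std=0.01):
        self.std = std

    def __call__(self, image: torch.Tensor) -> torch.Tensor:
        return (image + torch.randn_like(image) * self.std).clamp(0.0, 1.0)

dataset = LeRobotDataset(
    "lerobot/pusht",
    image_transforms=RandomGaussianNoise(std=0.02)
)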

Best Practices

Data Organization

  1. One task per dataset: Keep related episodes together
  2. Consistent structure: Same observations across all episodes
  3. Metadata completeness: Fill in all relevant metadata
  4. Video compression: Use H.264 for efficient storage (see the encoding sketch below)
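
As a sketch of the compression point above, numbered image frames can be packed into an H.264 MP4 with the standard ffmpeg CLI (the frame filename pattern and fps here are assumptions):

import subprocess

# Encode numbered PNG frames into an H.264 MP4 at 30 fps.
# yuv420p keeps the file playable in most players; CRF 23 is a
# reasonable quality/size trade-off.
subprocess.run([
    "ffmpeg",
    "-framerate", "30",
    "-i", "frames/frame_%06d.png",
    "-c:v", "libx264",
    "-pix_fmt", "yuv420p",
    "-crf", "23",
    "episode_000000.mp4",
], check=True)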

Performance Tips

# Choose the video decoding backend (pyav is the default)
dataset = LeRobotDataset(
    "lerobot/pusht",
    video_backend="pyav"
)

# Pre-load episodes for faster access
dataset.preload_episodes(range(100))

# Use memory mapping for large datasets
dataset = LeRobotDataset(
    "lerobot/pusht",
    use_mmap=True
)

Examples

Example Datasets

Browse available datasets:

from lerobot import available_datasets

# available_datasets is a list of repo ids on the Hugging Face Hub
for name in available_datasets:
    print(name)

# Output:
# lerobot/pusht
# lerobot/aloha_static
# lerobot/xarm_lift
# ...

Popular datasets:

  • lerobot/pusht: 2D pushing task
  • lerobot/aloha_static: Bimanual manipulation
  • lerobot/xarm_lift: Object lifting
  • lerobot/koch_pick: Pick and place

Next Steps