LeRobot Format Specification¶
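This page gives a detailed specification of the LeRobot dataset format: directory layout, parquet schema, metadata files, data types, normalization, and validation.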
File Structure¶
Directory Layout¶
dataset_root/
├── meta/
│   ├── info.json          # Required: Dataset metadata
│   ├── tasks.json         # Optional: Task grouping
│   ├── stats.json         # Optional: Dataset statistics
│   └── episodes.jsonl     # Optional: Episode-level metadata
├── episodes/
│   ├── episode_000000.parquet   # Episode 0 data
│   ├── episode_000001.parquet   # Episode 1 data
│   └── ...
└── videos/
    ├── observation.image/
    │   ├── episode_000000.mp4   # Video for episode 0
    │   ├── episode_000001.mp4
    │   └── ...
    └── observation.wrist_image/ # Additional camera views
        └── ...
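The layout above implies a fixed naming scheme: a zero-padded, six-digit episode index shared by the parquet file and each camera's video. The following is a minimal sketch of resolving those paths, assuming that naming holds for every episode. The helper name `episode_paths` is hypothetical and not part of any LeRobot API.

```python
from pathlib import PurePosixPath


def episode_paths(root, episode_index, camera_keys=("observation.image",)):
    """Return the expected parquet and video paths for one episode.

    Hypothetical helper: it only mirrors the directory layout shown above.
    """
    stem = f"episode_{episode_index:06d}"
    root = PurePosixPath(root)
    return {
        "parquet": root / "episodes" / f"{stem}.parquet",
        "videos": {key: root / "videos" / key / f"{stem}.mp4" for key in camera_keys},
    }


paths = episode_paths("dataset_root", 1, ("observation.image", "observation.wrist_image"))
print(paths["parquet"])
```

Keeping the path logic in one place makes it easier to spot episodes whose video or parquet file is missing.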
Parquet Schema¶
Required Columns¶
{
    'episode_index': 'int64',      # Episode number
    'frame_index': 'int64',        # Frame within episode
    'timestamp': 'float64',        # Timestamp in seconds
    'action': 'list<float32>',     # Action vector
}
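As a rough illustration of how the required columns could be enforced, the sketch below compares a `{column: type}` mapping against the table above. It treats the type names as plain strings; how that mapping is read from an actual parquet file is left out, and the function name `check_required_columns` is an assumption made for this example.

```python
REQUIRED_COLUMNS = {
    "episode_index": "int64",
    "frame_index": "int64",
    "timestamp": "float64",
    "action": "list<float32>",
}


def check_required_columns(schema):
    """Compare a {column: type-string} mapping against the required columns.

    Returns a list of human-readable problems; an empty list means the
    required part of the schema is satisfied.
    """
    problems = []
    for name, expected in REQUIRED_COLUMNS.items():
        if name not in schema:
            problems.append(f"missing column: {name}")
        elif schema[name] != expected:
            problems.append(f"{name}: expected {expected}, got {schema[name]}")
    return problems


# Example schema with one wrong type and one missing column.
example = {"episode_index": "int64", "frame_index": "int32", "action": "list<float32>"}
problems = check_required_columns(example)
print(problems)
```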
Optional Columns¶
{
    # Observations
    'observation.state': 'list<float32>',    # Proprioceptive state
    'observation.image': 'string',           # Path to image/video frame
    'observation.wrist_image': 'string',     # Wrist camera
    'observation.depth': 'string',           # Depth image
    'observation.goal': 'list<float32>',     # Goal state
    # Language
    'language_instruction': 'string',        # Task description
    # Episode info
    'next.done': 'bool',                     # Episode end
    'next.success': 'bool',                  # Task success
    'next.reward': 'float32',                # Reward signal
}
Metadata Files¶
info.json¶
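Complete specification. The `#` annotations are explanatory only; JSON does not allow comments, so omit them in a real file: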
{
  "codebase_version": "2.0",
  "robot_type": "franka",  # Robot platform
  "total_episodes": 1000,
  "total_frames": 50000,
  "fps": 30,  # Frames per second
  "encoding": {
    "video_codec": "h264",  # Video encoding
    "video_quality": 23,  # CRF value (lower = better)
    "audio": false
  },
  "chunks_size": 1000,  # Frames per chunk (for large datasets)
  "shapes": {
    "observation.image": [3, 224, 224],
    "observation.state": [7],
    "action": [7]
  },
  "names": {
    "observation.state": ["q0", "q1", "q2", "q3", "q4", "q5", "q6"],
    "action": ["dx", "dy", "dz", "droll", "dpitch", "dyaw", "gripper"]
  },
  "stats": {
    "observation.state": {
      "mean": [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
      "std": [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
      "min": [-1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0],
      "max": [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
    },
    "action": {
      "mean": [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5],
      "std": [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.5]
    }
  }
}
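The `shapes`, `names`, and `stats` entries describe the same vectors, so their lengths should agree. The sketch below cross-checks them for 1-D features only. It is an illustrative check under that assumption, and `check_info` is a hypothetical name rather than an existing LeRobot function.

```python
def check_info(info):
    """Cross-check the "shapes", "names", and "stats" entries of an info dict."""
    problems = []
    shapes = info.get("shapes", {})
    for key, names in info.get("names", {}).items():
        shape = shapes.get(key)
        if shape is None:
            problems.append(f"{key}: listed in names but not in shapes")
        elif len(shape) != 1 or shape[0] != len(names):
            problems.append(f"{key}: {len(names)} names for shape {shape}")
    for key, stats in info.get("stats", {}).items():
        shape = shapes.get(key)
        if shape is None or len(shape) != 1:
            continue  # only 1-D vector features are checked in this sketch
        for stat_name, values in stats.items():
            if len(values) != shape[0]:
                problems.append(
                    f"{key}.{stat_name}: length {len(values)}, expected {shape[0]}"
                )
    return problems


# Example with a deliberately short "std" vector.
info = {
    "shapes": {"observation.state": [7], "action": [7]},
    "names": {"action": ["dx", "dy", "dz", "droll", "dpitch", "dyaw", "gripper"]},
    "stats": {"action": {"mean": [0.0] * 7, "std": [0.1] * 6}},
}
problems = check_info(info)
print(problems)
```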
tasks.json¶
{
  "tasks": [
    {
      "task_index": 0,
      "task_name": "pick_cube",
      "episodes": [0, 1, 2, 3, 4, 5],
      "success_rate": 0.95,
      "avg_episode_length": 50
    }
  ]
}
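Because `tasks.json` groups episodes by task, a loader often needs the reverse lookup from episode to task. The sketch below builds it, assuming each episode belongs to exactly one task (the specification above does not state this). The second task, `open_drawer`, and the function name `episode_to_task` are made up for the example.

```python
import json

# Illustrative document; "open_drawer" is a hypothetical second task.
tasks_json = """
{
  "tasks": [
    {"task_index": 0, "task_name": "pick_cube", "episodes": [0, 1, 2]},
    {"task_index": 1, "task_name": "open_drawer", "episodes": [3, 4]}
  ]
}
"""


def episode_to_task(tasks_doc):
    """Invert the task -> episodes grouping into an episode -> task_name lookup."""
    lookup = {}
    for task in tasks_doc["tasks"]:
        for episode in task["episodes"]:
            if episode in lookup:
                raise ValueError(f"episode {episode} is assigned to more than one task")
            lookup[episode] = task["task_name"]
    return lookup


lookup = episode_to_task(json.loads(tasks_json))
print(lookup[3])
```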
Data Types¶
Action Representation¶
import numpy as np

# End-effector delta actions (recommended)
ee_action = {
    'delta_pos': [dx, dy, dz],             # Positional change
    'delta_rot': [droll, dpitch, dyaw],    # Rotational change
    'gripper': gripper_state               # Gripper (0=open, 1=closed)
}
# Joint space actions
joint_action = {
    'joint_positions': [q0, q1, q2, q3, q4, q5, q6],
    'gripper': gripper_state
}
# Stored as a flattened float32 array (shown here for the end-effector form)
action_array = np.concatenate(
    [ee_action['delta_pos'], ee_action['delta_rot'], [ee_action['gripper']]]
).astype(np.float32)
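Reading a stored action back requires reversing the flattening. The sketch below slices a 7-element end-effector action into its named parts, assuming the order `[dx, dy, dz, droll, dpitch, dyaw, gripper]` given under `names` in `info.json`. The helper name `unpack_ee_action` is hypothetical.

```python
def unpack_ee_action(flat):
    """Split a flat 7-element end-effector action back into named parts.

    Assumes the order [dx, dy, dz, droll, dpitch, dyaw, gripper].
    """
    if len(flat) != 7:
        raise ValueError(f"expected 7 values, got {len(flat)}")
    return {
        "delta_pos": list(flat[0:3]),
        "delta_rot": list(flat[3:6]),
        "gripper": flat[6],
    }


parts = unpack_ee_action([0.01, 0.0, -0.02, 0.0, 0.0, 0.1, 1.0])
print(parts["delta_pos"], parts["gripper"])
```

The length check catches the case where a joint-space action (8 values in the layout above) is passed by mistake.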
Image Storage¶
Images are stored as video files, and each parquet row references its frame by path:
# Path format in parquet
'observation.image': 'videos/observation.image/episode_000000.mp4#frame=42'
# Or as individual frames
'observation.image': 'videos/observation.image/episode_000000_frame_000042.png'
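A reader has to split the `#frame=N` suffix from the video path before decoding. The following sketch does that with plain string operations. It assumes frames are numbered from zero and uses the `"fps": 30` value from the `info.json` example to convert a frame index to a time offset. The function name `parse_frame_ref` is hypothetical.

```python
def parse_frame_ref(ref):
    """Split 'path.mp4#frame=N' into (path, N); plain image paths give (path, None)."""
    path, sep, fragment = ref.partition("#")
    if not sep:
        return path, None
    key, _, value = fragment.partition("=")
    if key != "frame":
        raise ValueError(f"unsupported fragment: {fragment!r}")
    return path, int(value)


video_path, frame = parse_frame_ref(
    "videos/observation.image/episode_000000.mp4#frame=42"
)
# Offset into the video in seconds, assuming 30 FPS and zero-based frame numbers.
seconds = frame / 30
print(video_path, frame, seconds)
```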
Normalization¶
Action Normalization¶
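import numpy as np

# Z-score normalize, then clip to [-1, 1].
# Clipping is lossy: values more than one std from the mean cannot be recovered.
def normalize_action(action, stats):
    action_normalized = (action - np.asarray(stats['mean'])) / np.asarray(stats['std'])
    action_normalized = np.clip(action_normalized, -1, 1)
    return action_normalized

# Denormalize (exact inverse only for values that were not clipped)
def denormalize_action(action_normalized, stats):
    action = action_normalized * np.asarray(stats['std']) + np.asarray(stats['mean'])
    return action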
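If actions must land in [-1, 1] and still round-trip exactly, min-max scaling is one alternative to z-score plus clipping. The sketch below is that different technique, not the method specified above. It assumes the action stats include `min` and `max` entries, which the `info.json` example only shows for `observation.state`. The function names are hypothetical.

```python
import numpy as np


def minmax_normalize(x, stats):
    """Map x linearly so that stats['min'] -> -1 and stats['max'] -> 1."""
    lo = np.asarray(stats["min"], dtype=np.float64)
    hi = np.asarray(stats["max"], dtype=np.float64)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid dividing by zero on constant dims
    return 2.0 * (np.asarray(x, dtype=np.float64) - lo) / span - 1.0


def minmax_denormalize(x_norm, stats):
    """Inverse of minmax_normalize."""
    lo = np.asarray(stats["min"], dtype=np.float64)
    hi = np.asarray(stats["max"], dtype=np.float64)
    span = np.where(hi > lo, hi - lo, 1.0)
    return (np.asarray(x_norm, dtype=np.float64) + 1.0) / 2.0 * span + lo


stats = {"min": [-0.5, 0.0], "max": [0.5, 1.0]}
action = np.array([0.25, 1.0])
normalized = minmax_normalize(action, stats)
restored = minmax_denormalize(normalized, stats)
print(normalized, restored)
```

The trade-off is sensitivity to outliers: a single extreme value in the data stretches `min` or `max` and compresses everything else.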
State Normalization¶
import numpy as np

# Z-score normalization (assumes every std entry is nonzero)
def normalize_state(state, stats):
    return (np.asarray(state) - np.asarray(stats['mean'])) / np.asarray(stats['std'])
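The `stats` dictionaries used above have to be computed from the data at some point. The sketch below shows one way to derive per-dimension `mean`, `std`, `min`, and `max` using only the standard library. It assumes the population standard deviation is wanted, which this specification does not state, and `compute_stats` is a hypothetical name.

```python
from statistics import fmean, pstdev


def compute_stats(vectors):
    """Per-dimension mean/std/min/max over a list of equal-length vectors."""
    columns = list(zip(*vectors))  # transpose: one tuple per dimension
    return {
        "mean": [fmean(col) for col in columns],
        "std": [pstdev(col) for col in columns],  # population std (assumption)
        "min": [min(col) for col in columns],
        "max": [max(col) for col in columns],
    }


# Three 2-D state vectors; the second dimension is constant.
states = [[0.0, 1.0], [2.0, 1.0], [4.0, 1.0]]
stats = compute_stats(states)
print(stats)
```

A zero `std`, as in the second dimension here, makes the z-score division undefined, so constant dimensions need a floor value or special handling before normalizing.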
Validation¶
Dataset Validation¶
from lerobot.common.datasets.utils import validate_dataset

# Validate dataset structure
errors = validate_dataset('data/my_dataset')
if errors:
    print("Validation errors:")
    for error in errors:
        print(f"  - {error}")
else:
    print("Dataset is valid!")
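For a quick structural check that does not depend on any library helper, the sketch below confirms only that `meta/info.json` exists and that the number of episode parquet files matches `total_episodes`. It is a hypothetical stand-in covering a small part of what a full validator would do, and `check_layout` is not a LeRobot function.

```python
import json
import tempfile
from pathlib import Path


def check_layout(root):
    """Return a list of structural problems found under a dataset root."""
    root = Path(root)
    info_path = root / "meta" / "info.json"
    if not info_path.is_file():
        return ["meta/info.json is missing"]
    info = json.loads(info_path.read_text())
    errors = []
    parquet_files = sorted((root / "episodes").glob("episode_*.parquet"))
    expected = info.get("total_episodes")
    if expected is None:
        errors.append("info.json has no total_episodes field")
    elif len(parquet_files) != expected:
        errors.append(f"found {len(parquet_files)} episode files, expected {expected}")
    return errors


# Demonstration on a throwaway directory with one episode file missing.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "meta").mkdir()
    (root / "episodes").mkdir()
    (root / "meta" / "info.json").write_text(json.dumps({"total_episodes": 2}))
    (root / "episodes" / "episode_000000.parquet").touch()
    errors = check_layout(root)

print(errors)
```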
Common Issues¶
| Issue | Description | Fix |
|---|---|---|
| Missing metadata | info.json incomplete | Add required fields |
| Shape mismatch | Array sizes don't match spec | Ensure consistent shapes |
| Invalid paths | Video paths don't exist | Check file paths |
| Inconsistent FPS | FPS varies across episodes | Resample to consistent FPS |
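The "Inconsistent FPS" issue can be detected from the `timestamp` column alone. The sketch below estimates an episode's frame rate from its first and last timestamps and compares it with the `fps` declared in `info.json`. The 1% tolerance and the function names are assumptions chosen for illustration.

```python
def estimate_fps(timestamps):
    """Estimate FPS as (number of intervals) / (elapsed seconds)."""
    if len(timestamps) < 2:
        raise ValueError("need at least two timestamps")
    elapsed = timestamps[-1] - timestamps[0]
    if elapsed <= 0:
        raise ValueError("timestamps must be increasing")
    return (len(timestamps) - 1) / elapsed


def fps_matches(timestamps, expected_fps, rel_tol=0.01):
    """True when the estimated FPS is within rel_tol of expected_fps."""
    return abs(estimate_fps(timestamps) - expected_fps) <= rel_tol * expected_fps


good = [i / 30 for i in range(60)]  # recorded at 30 FPS
bad = [i / 15 for i in range(60)]   # recorded at 15 FPS
print(fps_matches(good, 30), fps_matches(bad, 30))
```

This only catches a wrong average rate; dropped frames inside an otherwise correct episode would need a per-interval check.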
Next Steps¶
- Usage Guide - How to use the format
- Examples - Complete examples
- Data Collection - Collect your own data