Score Lerobot Episodes
| Entity Passport | |
| Registry ID | gh-model--roboticsdata--score_lerobot_episodes |
| License | Apache-2.0 |
| Provider | github |
Cite this model
Academic & Research Attribution
@misc{gh_model__roboticsdata__score_lerobot_episodes,
author = {RoboticsData},
title = {Score Lerobot Episodes Model},
year = {2026},
howpublished = {\url{https://github.com/roboticsdata/score_lerobot_episodes}},
note = {Accessed via Free2AITools Knowledge Fortress}
}
Quick Commands
git clone https://github.com/roboticsdata/score_lerobot_episodes
Technical Deep Dive
A lightweight toolkit for quantitatively scoring LeRobot episodes.
[!NOTE] The features in this repository are now integrated into Robotdata Studio.
- Instant ~20% quality boost with our data capture platform
- Seamless integration with your existing LeRobot datasets
- Powerful diversity scoring and data sanitization techniques
**LeRobot Episode Scoring Toolkit**
A comprehensive toolkit for evaluating and filtering LeRobot episode datasets. It combines classic computer vision heuristics (blur/exposure tests, kinematic smoothness, collision spikes) with optional Gemini-powered vision-language checks to give each episode a 0–1 score across multiple quality dimensions.
Use this toolkit to:
- Automatically score robot demonstration episodes on visual clarity, motion smoothness, collision detection, and more
- Filter low-quality episodes to improve downstream training performance
- Train and compare baseline vs. filtered dataset models
- Visualize score distributions and identify problematic episodes
Table of Contents
- Features
- Installation
- Quick Start
- Usage
- Output Format
- Repository Structure
- Training and Evaluation
- Troubleshooting
- Contributing
- License
⨠Features
| Dimension | Function | What it measures |
|---|---|---|
| Visual clarity | score_visual_clarity |
Blur, over/under-exposure, low-light frames |
| Smoothness | score_smoothness |
2nd derivative of joint angles |
| Path efficiency | score_path_efficiency |
Ratio of straight-line vs. actual joint-space path |
| Collision / spikes | score_collision |
Sudden acceleration outliers (proxy for contacts) |
| Joint stability (final 2 s) | score_joint_stability |
Stillness at the goal pose |
| Gripper consistency | score_gripper_consistency |
Binary "closed vs. holding" agreement |
| Actuator saturation | score_actuator_saturation |
Difference between commanded actions and achieved states |
| Task success (VLM) | score_task_success (via VLMInterface) |
Gemini grades whether the desired behaviour happened |
| Task success (VLM) | score_task_success (via VLMInterface) |
Gemini grades whether the desired behavior happened |
| Runtime penalty / outliers | score_runtime + build_time_stats, is_time_outlier |
Episode length vs. nominal / Tukey-IQR / Z-score fences |
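The two kinematic rows above (smoothness and path efficiency) can be illustrated with a minimal sketch. The functions below are hypothetical stand-ins, not the repository's `score_smoothness` or `score_path_efficiency`. They assume a trajectory is a list of joint-angle vectors sampled every `dt` seconds, and the `scale` constant that maps acceleration onto a 0–1 score is an arbitrary illustrative choice.

```python
import math
from typing import List, Sequence

Trajectory = Sequence[Sequence[float]]  # one joint-angle vector per timestep


def second_differences(traj: Trajectory, dt: float) -> List[List[float]]:
    """Finite-difference acceleration of every joint at each interior timestep."""
    return [
        [(nxt[j] - 2.0 * cur[j] + prev[j]) / (dt * dt) for j in range(len(cur))]
        for prev, cur, nxt in zip(traj, traj[1:], traj[2:])
    ]


def smoothness_sketch(traj: Trajectory, dt: float, scale: float = 10.0) -> float:
    """Map mean absolute joint acceleration into (0, 1]; 1.0 means no acceleration.

    `scale` is a hypothetical tuning constant, not a value taken from the toolkit.
    """
    acc = [a for row in second_differences(traj, dt) for a in row]
    if not acc:
        return 1.0  # fewer than three samples: nothing to penalize
    mean_abs = sum(abs(a) for a in acc) / len(acc)
    return 1.0 / (1.0 + mean_abs / scale)


def path_efficiency_sketch(traj: Trajectory) -> float:
    """Straight-line joint-space distance divided by the travelled path length."""
    if len(traj) < 2:
        return 1.0
    travelled = sum(math.dist(a, b) for a, b in zip(traj, traj[1:]))
    if travelled == 0.0:
        return 1.0  # the arm never moved, so there is no wasted motion
    return math.dist(traj[0], traj[-1]) / travelled


straight = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
detour = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]]
print(smoothness_sketch(straight, dt=1.0), path_efficiency_sketch(straight))
print(smoothness_sketch(detour, dt=1.0), path_efficiency_sketch(detour))
```

A constant-velocity path scores 1.0 on both measures, while the detour loses points on both. The actual scoring functions may normalize differently.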
Installation
Prerequisites
- Python 3.8 or higher
- pip package manager
Setup
1. Clone the repository
```bash
git clone https://github.com/RoboticsData/score_lerobot_episodes.git
cd score_lerobot_episodes
```
2. Install dependencies
```bash
# Install in editable mode with all dependencies
pip install -e .
```
Or using uv (faster):
```bash
# Install uv if you haven't already
pip install uv
# Install the package
uv pip install -e .
```
3. Set up API keys (optional)
Only required if using VLM-based scoring with Gemini:
```bash
export GOOGLE_API_KEY="your-api-key-here"
```
Note: The free-tier rate limits of the Gemini API are fairly restrictive and might need to be upgraded depending on episode length. Check Gemini API rate limits for more info.
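Because the API key is optional, a wrapper script may want to choose the vision backend based on whether the key is present. The helper below is a hypothetical convenience, not part of the toolkit. It only assumes the `GOOGLE_API_KEY` variable and the two `--vision_type` values named in this README.

```python
import os
from typing import Mapping, Optional


def pick_vision_type(env: Optional[Mapping[str, str]] = None) -> str:
    """Return 'vlm_gemini' when a non-empty GOOGLE_API_KEY is set, else 'opencv'."""
    env = os.environ if env is None else env
    has_key = bool(env.get("GOOGLE_API_KEY", "").strip())
    return "vlm_gemini" if has_key else "opencv"


# The result can be passed straight to the CLI as the --vision_type value.
print(pick_vision_type())
```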
Quick Start
Score a dataset and save results:
```bash
python score_dataset.py \
  --repo_id lerobot/aloha_static_pro_pencil \
  --output ./output/lerobot/aloha_static_pro_pencil \
  --threshold 0.5
```
This will:
- Download and load the dataset from HuggingFace
- Score each episode across multiple quality dimensions
- Save scores to output path
- Keep only episodes with an aggregate score >= 0.5
- Save the filtered dataset to the output directory
Usage
Command-line Arguments
Required Arguments
- `--repo_id`: HuggingFace repository ID for the dataset (e.g., `username/dataset-name`)
Optional Arguments
- `--root`: Local path to dataset root (default: downloads from HuggingFace Hub)
- `--output`: Output directory for filtered dataset (default: None, no filtering)
- `--threshold`: Minimum aggregate score to keep episodes (default: 0.5, range: 0.0-1.0)
- `--nominal`: Expected episode duration in seconds (used for runtime scoring)
- `--vision_type`: Vision scoring method, choices: `opencv` (default), `vlm_gemini`
- `--policy_name`: Policy type for training (default: `act`)
- `--overwrite`: Overwrite existing filtered dataset (default: True)
- `--overwrite_checkpoint`: Overwrite existing training checkpoints (default: False)
- `--train-baseline`: Train model on unfiltered dataset (default: False)
- `--train-filtered`: Train model on filtered dataset (default: False)
- `--plot`: Display score distribution plots in terminal (default: False)
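The `--nominal` option feeds runtime scoring, which this README describes as comparing episode length against a nominal duration and Tukey-IQR fences. The sketch below shows one way such a check could work. The function names and the `nominal / duration` decay are assumptions made for illustration and are not the toolkit's `score_runtime` or `is_time_outlier`.

```python
import statistics
from typing import Sequence, Tuple


def tukey_fences(durations: Sequence[float], k: float = 1.5) -> Tuple[float, float]:
    """Lower and upper Tukey fences: Q1 - k*IQR and Q3 + k*IQR.

    Needs at least two durations, as required by statistics.quantiles.
    """
    q1, _, q3 = statistics.quantiles(durations, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def is_duration_outlier(duration: float, durations: Sequence[float], k: float = 1.5) -> bool:
    """True when `duration` falls outside the Tukey fences of the dataset."""
    low, high = tukey_fences(durations, k)
    return duration < low or duration > high


def runtime_score_sketch(duration: float, nominal: float) -> float:
    """1.0 at or under the nominal duration, decaying as nominal/duration beyond it.

    The decay shape is an illustrative assumption.
    """
    if duration <= nominal:
        return 1.0
    return nominal / duration


episode_seconds = [10.0, 11.0, 12.0, 13.0, 14.0, 40.0]
print(tukey_fences(episode_seconds))
print([d for d in episode_seconds if is_duration_outlier(d, episode_seconds)])
print(runtime_score_sketch(20.0, nominal=10.0))
```

With these sample durations the 40-second episode falls outside the fences, and an episode twice the nominal length scores 0.5 under the assumed decay.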
Examples
1. Basic scoring (no filtering)
```bash
python score_dataset.py --repo_id username/my-robot-dataset
```
2. Score and filter dataset
```bash
python score_dataset.py \
  --repo_id username/my-robot-dataset \
  --output ./output/username/my-robot-dataset \
  --threshold 0.6
```
3. Score with VLM-based vision analysis
```bash
export GOOGLE_API_KEY="your-key"
python score_dataset.py \
  --repo_id username/my-robot-dataset \
  --vision_type vlm_gemini \
  --output ./filtered_data
```
4. Score, filter, and train both baseline and filtered models
```bash
python score_dataset.py \
  --repo_id username/my-robot-dataset \
  --output ./output/username/my-robot-dataset \
  --threshold 0.5 \
  --train-baseline True \
  --train-filtered True \
  --policy_name act
```
5. Visualize distributions
```bash
python score_dataset.py \
  --repo_id username/my-robot-dataset \
  --threshold 0.7 \
  --plot True
```
6. Use local dataset instead of downloading
```bash
python score_dataset.py \
  --repo_id username/my-robot-dataset \
  --root /path/to/local/dataset \
  --output ./filtered_output
```
Output Format
JSON Scores File
Saved to results/{repo_id}_scores.json:
```json
[
  {
    "episode_id": 0,
    "camera_type": "camera_0",
    "video_path": "/path/to/video.mp4",
    "aggregate_score": 0.752,
    "per_attribute_scores": {
      "visual_clarity": 0.85,
      "smoothness": 0.78,
      "collision": 0.92,
      "runtime": 0.65
    }
  },
  ...
]
```
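A scores file in this shape is easy to post-process. The sketch below is a hypothetical helper, not part of the toolkit. It splits records at a threshold and reports the weakest attribute of each dropped episode. The sample records reuse the example values shown in this README, and the `>=` comparison mirrors the Quick Start description.

```python
import json
from typing import Dict, List, Tuple

Record = Dict[str, object]

SAMPLE_SCORES = """
[
  {"episode_id": 0, "camera_type": "camera_0", "aggregate_score": 0.752,
   "per_attribute_scores": {"visual_clarity": 0.85, "smoothness": 0.78,
                            "collision": 0.92, "runtime": 0.65}},
  {"episode_id": 1, "camera_type": "camera_1", "aggregate_score": 0.590,
   "per_attribute_scores": {"visual_clarity": 0.42, "smoothness": 0.65,
                            "collision": 0.71, "runtime": 0.58}}
]
"""


def split_by_threshold(records: List[Record], threshold: float) -> Tuple[List[Record], List[Record]]:
    """Return (kept, dropped), keeping records with aggregate_score >= threshold."""
    kept = [r for r in records if r["aggregate_score"] >= threshold]
    dropped = [r for r in records if r["aggregate_score"] < threshold]
    return kept, dropped


def weakest_attribute(record: Record) -> str:
    """Name of the lowest-scoring entry in per_attribute_scores."""
    scores = record["per_attribute_scores"]
    return min(scores, key=scores.get)


records = json.loads(SAMPLE_SCORES)
kept, dropped = split_by_threshold(records, threshold=0.6)
print("kept:", [r["episode_id"] for r in kept])
for r in dropped:
    print("dropped:", r["episode_id"], "weakest:", weakest_attribute(r))
```

To work on a real run, replace `json.loads(SAMPLE_SCORES)` with `json.load` on the generated scores file.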
Console Output
Displays a formatted table showing scores for each episode:
```
Episode scores (0–1 scale)
─────────────────────────────────────────────────────────────────────────────────────
Episode  Camera    visual_clarity  smoothness  collision  runtime  Aggregate  Status
0        camera_0  0.850           0.780       0.920      0.650    0.752      GOOD
1        camera_1  0.420           0.650       0.710      0.580    0.590      BAD
...
─────────────────────────────────────────────────────────────────────────────────────
Average aggregate over 20 videos: 0.671
Percentage of episodes removed: 0.25, total: 5
```
Filtered Dataset
When using `--output`, a new filtered dataset is created containing only the episodes whose aggregate score meets the threshold, maintaining the original LeRobot dataset structure.
Repository Structure
```
score_lerobot_episodes/
├── src/
│   └── score_lerobot_episodes/   # Installable package
│       ├── __init__.py
│       ├── data.py               # Dataset utilities
│       ├── vlm.py                # Vision-Language Model
│       ├── evaluation.py         # Evaluation utilities
│       ├── corrupt.py            # Data corruption tools
│       └── scores/               # Scoring criteria modules
├── score_dataset.py              # Main scoring script
├── train.py                      # Training pipeline integration
├── ui.py                         # Streamlit web interface (if available)
├── pyproject.toml                # Package configuration and dependencies
├── requirements.txt              # Python dependencies (legacy)
├── README.md                     # This file
├── CONTRIBUTING.md               # Contribution guidelines
├── LICENSE                       # Apache 2.0 license
├── .gitignore                    # Git ignore rules
├── results/                      # Generated score JSON files
├── output/                       # Filtered datasets
└── checkpoints/                  # Training checkpoints
```
Training and Evaluation
The toolkit integrates with LeRobot's training pipeline to compare baseline vs. filtered dataset performance.
Training Workflow
1. Baseline Training: Train on the original unfiltered dataset
```bash
python score_dataset.py \
  --repo_id username/dataset \
  --train-baseline True
```
2. Filtered Training: Train on the quality-filtered dataset
```bash
python score_dataset.py \
  --repo_id username/dataset \
  --output ./filtered_data \
  --threshold 0.6 \
  --train-filtered True
```
3. Compare Both: Run both training pipelines in one command
```bash
python score_dataset.py \
  --repo_id username/dataset \
  --output ./filtered_data \
  --train-baseline True \
  --train-filtered True
```
Training Configuration
- Default policy: ACT (Action Chunking Transformer)
- Default steps: 10,000
- Batch size: 4
- Checkpoints saved to `./checkpoints/{job_name}/`
- WandB logging enabled by default
You can customize training parameters by modifying train.py.
Troubleshooting
Common Issues
1. ModuleNotFoundError: No module named 'google.generativeai'
- Solution: Install dependencies with `pip install -r requirements.txt`
- If using VLM scoring, ensure `google-generativeai` is installed
2. API rate limit errors with Gemini
- Solution: The free tier has restrictive limits. Consider:
  - Using `--vision_type opencv` instead
  - Upgrading to a paid Gemini API tier
  - Processing smaller batches
3. All episodes filtered out
- Error: `ValueError: All episodes filtered out, decrease threshold to fix this`
- Solution: Lower the `--threshold` value (e.g., from 0.5 to 0.3)
4. Dataset not found
- Solution:
  - Verify the `--repo_id` is correct
  - Check internet connection for HuggingFace Hub access
  - Use `--root` to specify a local dataset path
5. Out of memory during training
- Solution: Reduce `batch_size` in `train.py:44` or use a smaller model
6. Permission errors when overwriting
- Solution: Use `--overwrite True` or manually delete the output directory
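For the Gemini rate-limit issue above, callers that wrap their own VLM requests sometimes add retries with exponential backoff. The sketch below is a generic pattern under stated assumptions, not code from this toolkit. `call_with_backoff` and its parameters are hypothetical, and it catches a broad `Exception` only because the concrete rate-limit exception of the client library is not specified here. In real use, narrow that to the specific error type.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `fn`, retrying on failure with exponentially growing, jittered delays."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:  # assumption: narrow this to the client's rate-limit error
            if attempt == max_attempts - 1:
                raise  # out of retries: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Up to 10% jitter spreads out retries from parallel workers.
            sleep(delay + random.uniform(0.0, delay * 0.1))
    raise RuntimeError("unreachable")  # the loop always returns or raises


# Usage: wrap a zero-argument callable that performs one request.
attempts = {"count": 0}


def flaky_request() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("simulated rate limit")
    return "ok"


print(call_with_backoff(flaky_request, base_delay=0.01))
```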
Contributing
We welcome contributions! Please see CONTRIBUTING.md for guidelines on:
- Setting up a development environment
- Code style and conventions
- Submitting pull requests
- Reporting issues
Quick Contribution Steps
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
License
LeRobot Episode Scoring Toolkit is distributed under the Apache 2.0 License. See LICENSE for more information.
Support
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Documentation: This README and inline code documentation