⚠️

This is a Dataset, not a Model

The following metrics do not apply: FNI Score, Deployment Options, Model Architecture

πŸ“Š

physicalai-autonomous-vehicle-cosmos-drive-dreams

by nvidia Dataset

language: en
license: cc-by-4.0
size_categories: n>1T
task_categories: robotics
tags: Video, physicalAI, AV
github: https://github.com/nv-tlabs/Cosmos-Drive-Dreams

Technical Profile

Hardware & Scale

Size: -
Total Rows: -
Files: 99352

Training & Env

Format: Parquet
Cleaning: Raw

Cloud & Rights

Source: huggingface
License: CC-BY-4.0


Dataset Card

PhysicalAI-Autonomous-Vehicle-Cosmos-Drive-Dreams

Paper | Paper Website | GitHub

Download

We provide a download script for the dataset. If you have enough disk space, you can also use git to download the dataset from Hugging Face.

bash
usage: download.py [-h] --odir ODIR
                   [--file_types {hdmap,lidar,synthetic}[,…]]
                   [--workers N] [--clean_cache]

required arguments:
  --odir ODIR           Output directory where files are stored.

optional arguments:
  -h, --help            Show this help message and exit.
  --file_types {hdmap,lidar,synthetic}[,…]
                        Comma-separated list of data groups to fetch.
                          • hdmap     → common folders + 3d_* HD-map layers
                          • lidar     → common folders + lidar_raw
                          • synthetic → common folders + cosmos_synthetic
                        Default: hdmap,lidar,synthetic (all groups).
  --workers N           Parallel download threads (default: 1). Increase on fast
                        networks; reduce if you hit rate limits or disk bottlenecks.
  --clean_cache         Delete the temporary HuggingFace cache after each run to
                        reclaim disk space.

common folders (always downloaded, regardless of --file_types): all_object_info, captions, car_mask_coarse, ftheta_intrinsic, pinhole_intrinsic, pose, vehicle_pose
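The mapping from `--file_types` values to folders described above can be sketched in a few lines of Python. This is only an illustration of the documented behaviour, not the code in `download.py`; the helper name `folders_for` and the two constants are hypothetical, and `3d_*` is kept as a glob pattern because the help text does not list the individual HD-map layer folders.

```python
from typing import List

# Folder names copied from the help text; always downloaded.
COMMON_FOLDERS = [
    "all_object_info", "captions", "car_mask_coarse",
    "ftheta_intrinsic", "pinhole_intrinsic", "pose", "vehicle_pose",
]

# Group-specific additions. "3d_*" is a pattern, not a literal folder name.
GROUP_FOLDERS = {
    "hdmap": ["3d_*"],
    "lidar": ["lidar_raw"],
    "synthetic": ["cosmos_synthetic"],
}


def folders_for(file_types: str = "hdmap,lidar,synthetic") -> List[str]:
    """Hypothetical helper: folder patterns selected by a --file_types value."""
    selected = list(COMMON_FOLDERS)
    for group in file_types.split(","):
        group = group.strip()
        if group not in GROUP_FOLDERS:
            raise ValueError(f"unknown file type: {group!r}")
        for folder in GROUP_FOLDERS[group]:
            if folder not in selected:
                selected.append(folder)
    return selected


print(folders_for("lidar"))
```

With `"lidar"` the result is the seven common folders followed by `lidar_raw`; with the default value it also includes `3d_*` and `cosmos_synthetic`.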

Here are some examples:

code
# Use this to download the download.py script
wget https://raw.githubusercontent.com/nv-tlabs/Cosmos-Drive-Dreams/main/scripts/download.py

# download all (about 3TB)
python download.py --odir YOUR_DATASET_PATH --workers YOUR_WORKER_NUMBER

# download hdmap only
python download.py --odir YOUR_DATASET_PATH --file_types hdmap --workers YOUR_WORKER_NUMBER

# download lidar only
python download.py --odir YOUR_DATASET_PATH --file_types lidar --workers YOUR_WORKER_NUMBER

# download synthetic video only (about 700GB)
python download.py --odir YOUR_DATASET_PATH --file_types synthetic --workers YOUR_WORKER_NUMBER
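If the download is driven from a Python pipeline rather than a shell, the same invocations can be assembled programmatically. The sketch below assumes `download.py` has already been fetched into the working directory; `build_download_command` is a hypothetical helper, and it only uses the flags listed in the usage text above.

```python
import subprocess
import sys
from typing import List, Optional, Sequence


def build_download_command(
    odir: str,
    file_types: Optional[Sequence[str]] = None,
    workers: int = 1,
    clean_cache: bool = False,
    script: str = "download.py",
) -> List[str]:
    """Hypothetical helper: build the argument list for download.py."""
    cmd = [sys.executable, script, "--odir", odir]
    if file_types:
        # The script expects a single comma-separated value.
        cmd += ["--file_types", ",".join(file_types)]
    cmd += ["--workers", str(workers)]
    if clean_cache:
        cmd.append("--clean_cache")
    return cmd


cmd = build_download_command("YOUR_DATASET_PATH", file_types=["synthetic"], workers=4)
print(" ".join(cmd))

# Uncomment to actually start the download (about 700GB for synthetic only):
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting issues when the output path contains spaces.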

Dataset Description

This Cosmos-Drive-Dreams dataset contains labels for 5,843 10-second clips from the RDS-HQ dataset, along with 81,802 synthetic video samples generated by Cosmos-Drive-Dreams from these labels. Each synthetic video is 121 frames long and captures a wide variety of challenging scenarios, such as rainy, snowy, and foggy conditions, that may not be as readily available in real-world driving datasets. This dataset is ready for commercial/non-commercial AV only use.
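The figures above imply a simple ratio between labeled clips and synthetic videos, which the following arithmetic sketch makes explicit. It uses only the three numbers quoted in the description; the result is an average, since the description does not say how the videos are distributed across individual clips.

```python
NUM_CLIPS = 5_843               # labeled 10-second clips from RDS-HQ
NUM_SYNTHETIC_VIDEOS = 81_802   # synthetic video samples
FRAMES_PER_VIDEO = 121          # frames in each synthetic video

# Average number of synthetic videos generated per labeled clip.
videos_per_clip, remainder = divmod(NUM_SYNTHETIC_VIDEOS, NUM_CLIPS)

# Total number of synthetic frames across the dataset.
total_frames = NUM_SYNTHETIC_VIDEOS * FRAMES_PER_VIDEO

print(videos_per_clip, remainder)  # 14 0
print(total_frames)                # 9898042
```

The division is exact, so the dataset averages 14 synthetic videos per labeled clip.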

Data
