📊
Dataset

Fineweb Edu 100b Shuffle

by karpathy hf-dataset--karpathy--fineweb-edu-100b-shuffle
Nexus Index
31.5 Top 100%
S: Semantic 50
A: Authority 0
P: Popularity 55
R: Recency 34
Q: Quality 30
Tech Context
Vital Performance
0 DL / 30D
0.0%
Data Integrity 31.5 FNI Score
- Size
- Rows
Parquet Format
- Tokens
Dataset Information Summary
Entity Passport
Registry ID hf-dataset--karpathy--fineweb-edu-100b-shuffle
License odc-by
Provider huggingface
📜

Cite this dataset

Academic & Research Attribution

BibTeX
@misc{hf_dataset__karpathy__fineweb_edu_100b_shuffle,
  author = {karpathy},
  title = {Fineweb Edu 100b Shuffle Dataset},
  year = {2026},
  howpublished = {\url{https://huggingface.co/datasets/karpathy/fineweb-edu-100b-shuffle}},
  note = {Accessed via Free2AITools Knowledge Fortress}
}
APA Style
karpathy. (2026). Fineweb Edu 100b Shuffle [Dataset]. Free2AITools. https://huggingface.co/datasets/karpathy/fineweb-edu-100b-shuffle

🔬 Technical Deep Dive


⚖️ Nexus Index V2.0

💬 Index Insight

FNI V2.0 for Fineweb Edu 100b Shuffle: Semantic (S:50), Authority (A:0), Popularity (P:55), Recency (R:34), Quality (Q:30).

Free2AITools Nexus Index

Verification Authority

Unbiased Data Node Refresh: VFS Live
⬇️ Downloads: 45,444

👁️ Data Preview

📊

Row-level preview not available for this dataset.

Schema structure is shown in the Field Logic panel when available.


🧬 Field Logic

Schema not yet indexed for this dataset.

Dataset Specification

📊 Structured Schema (Zero-Fabrication)

Feature Key    Data Type
text           string

Estimated Rows: 97,230,848
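
Given the single `text` string column listed above, a minimal sketch of streaming a few rows with the Hugging Face `datasets` library might look like the following. The repository ID is taken from the dataset URL on this page; the `train` split name and the helper names `preview_rows` and `stream_fineweb_edu` are assumptions made for illustration, and streaming needs network access.

```python
from itertools import islice
from typing import Iterable, Iterator

REPO_ID = "karpathy/fineweb-edu-100b-shuffle"  # from the dataset URL on this page


def preview_rows(rows: Iterable[dict], n: int = 3, width: int = 80) -> list[str]:
    """Return the first `width` characters of the `text` field for up to `n` rows."""
    return [row["text"][:width] for row in islice(rows, n)]


def stream_fineweb_edu(split: str = "train") -> Iterator[dict]:
    """Lazily iterate rows instead of downloading every Parquet file first.

    Assumption: the split is named "train"; confirm on the dataset page.
    """
    # Imported lazily so preview_rows works without the package installed.
    from datasets import load_dataset  # requires `pip install datasets`

    return iter(load_dataset(REPO_ID, split=split, streaming=True))
```

Calling `preview_rows(stream_fineweb_edu())` would fetch only the first few rows, which is usually preferable to a full download for a dataset with an estimated 97 million rows.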

Social Proof

HuggingFace Hub
45.4K Downloads
🔄 Daily sync (03:00 UTC)

AI Summary: Based on Hugging Face metadata. Not a recommendation.


🛡️ Dataset Transparency Report

Technical metadata sourced from upstream repositories.

Open Metadata

🆔 Identity & Source

id: hf-dataset--karpathy--fineweb-edu-100b-shuffle
slug: karpathy--fineweb-edu-100b-shuffle
source: huggingface
author: karpathy
license: odc-by
tags: license:odc-by, size_categories:10m<n<100m, format:parquet, modality:text, library:datasets, library:dask, library:polars, library:mlcroissant, region:us
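
The tag list above follows a `key:value` pattern with repeated keys (four `library` entries). As a rough sketch of how such a string could be grouped for filtering, assuming comma-separated tags exactly as shown here, the hypothetical helper `parse_tags` below collects the values under each key:

```python
from collections import defaultdict

# Tag string copied from the metadata above.
TAGS = (
    "license:odc-by, size_categories:10m<n<100m, format:parquet, modality:text, "
    "library:datasets, library:dask, library:polars, library:mlcroissant, region:us"
)


def parse_tags(raw: str) -> dict[str, list[str]]:
    """Group comma-separated `key:value` tags by key, keeping repeated keys."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for tag in raw.split(","):
        # Split on the first colon only, so values may contain other symbols.
        key, _, value = tag.strip().partition(":")
        if key:
            grouped[key].append(value)
    return dict(grouped)


tags = parse_tags(TAGS)
```

Here `tags["library"]` holds the four libraries the listing names as able to read the Parquet files.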

⚙️ Technical Specs

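architecture: null
params billions: 100 (a model-oriented field; a dataset has no parameter count, and the "100b" in the name most likely refers to tokens)
context length: 4,096 (also a model-oriented field; it does not describe the dataset itself)
pipeline tag: (empty)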

📊 Engagement & Metrics

downloads: 45,444
stars: 161
forks: 0

Data indexed from public sources. Updated daily.