Dataset

Dolma3 Longmino Mix 100b 1125

by allenai (hf-dataset--allenai--dolma3_longmino_mix-100b-1125)
Nexus Index
40.2 Top 100%
S: Semantic 50
A: Authority 0
P: Popularity 56
R: Recency 73
Q: Quality 30
Parquet Format
Dataset Information Summary
Entity Passport
Registry ID hf-dataset--allenai--dolma3_longmino_mix-100b-1125
License odc-by
Provider huggingface

Cite this dataset

Academic & Research Attribution

BibTeX
@misc{hf_dataset__allenai__dolma3_longmino_mix_100b_1125,
  author = {allenai},
  title = {Dolma3 Longmino Mix 100b 1125 Dataset},
  year = {2026},
  howpublished = {\url{https://huggingface.co/datasets/allenai/dolma3_longmino_mix-100b-1125}},
  note = {Accessed via Free2AITools Knowledge Fortress}
}
APA Style
allenai. (2026). Dolma3 Longmino Mix 100b 1125 [Dataset]. Free2AITools. https://huggingface.co/datasets/allenai/dolma3_longmino_mix-100b-1125

Technical Deep Dive

âš–ī¸ Nexus Index V2.0

40.2
TOP 100% SYSTEM IMPACT
Semantic (S) 50
Authority (A) 0
Popularity (P) 56
Recency (R) 73
Quality (Q) 30

đŸ’Ŧ Index Insight

FNI V2.0 for Dolma3 Longmino Mix 100b 1125: Semantic (S:50), Authority (A:0), Popularity (P:56), Recency (R:73), Quality (Q:30).

Free2AITools Nexus Index

Verification Authority

Unbiased Data Node Refresh: VFS Live
âŦ‡ī¸
Downloads
73,128

đŸ‘ī¸ Data Preview

📊

Row-level preview not available for this dataset.

Schema structure is shown in the Field Logic panel when available.

🔗 Explore Full Dataset ↗

đŸ§Ŧ Field Logic

đŸ§Ŧ

Schema not yet indexed for this dataset.

Dataset Specification


Dolma 3 Longmino Mix (100B)

The Dolma 3 Longmino Mix (100B) is the data mixture used for the third stage of training of the Olmo 3 32B model.

Dataset Sources

Source                          Type
LC-s2pdf-REX 32k-64k            Synth PDFs
LC-s2pdf-CWE 32k-64k            Synth PDFs
LC-s2pdf 32k-64k                PDFs
LC-s2pdf 8k-32k (8-16k)         PDFs
LC-s2pdf 8k-32k (16-32k)        PDFs
Midtraining Data                Mix
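
The source names above carry length ranges (8-16k, 16-32k, 32k-64k). As a rough illustration of how a document could be assigned to one of those ranges, here is a minimal Python sketch. It assumes the ranges are token counts, that "k" means 1024, and that each range is half-open; none of this is confirmed by the dataset card, and `length_bucket` is a hypothetical helper, not part of any Dolma tooling.

```python
from typing import Optional

# Assumption: "k" means 1024 tokens and each range is half-open [low, high).
K = 1024

# Labels copied from the source names in the table; the bounds are assumed.
BUCKETS = [
    ("8-16k", 8 * K, 16 * K),
    ("16-32k", 16 * K, 32 * K),
    ("32k-64k", 32 * K, 64 * K),
]


def length_bucket(n_tokens: int) -> Optional[str]:
    """Return the label of the range containing n_tokens, or None if out of range."""
    for label, low, high in BUCKETS:
        if low <= n_tokens < high:
            return label
    return None
```

For example, `length_bucket(20000)` falls in the 16-32k range under these assumptions, while a 100-token document matches no range and returns `None`.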

Licensing Information

Dolma 3 Longmino is licensed under the Open Data Commons Attribution License v1.0 (ODC-By). It is intended for research and educational use. For more information, please see our Responsible Use Guidelines.

Citation

@misc{olmo2025olmo3,
title={Olmo 3},
author={Team Olmo and Allyson Ettinger and Amanda Bertsch and Bailey Kuehl and David Graham and David Heineman and Dirk Groeneveld and Faeze Brahman and Finbarr Timbers and Hamish Ivison and Jacob Morrison and Jake Poznanski and Kyle Lo and Luca Soldaini and Matt Jordan and Mayee Chen and Michael Noukhovitch and Nathan Lambert and Pete Walsh and Pradeep Dasigi and Robert Berry and Saumya Malik and Saurabh Shah and Scott Geng and Shane Arora and Shashank Gupta and Taira Anderson and Teng Xiao and Tyler Murray and Tyler Romero and Victoria Graf and Akari Asai and Akshita Bhagia and Alexander Wettig and Alisa Liu and Aman Rangapur and Chloe Anastasiades and Costa Huang and Dustin Schwenk and Harsh Trivedi and Ian Magnusson and Jaron Lochner and Jiacheng Liu and Lester James V. Miranda and Maarten Sap and Malia Morgan and Michael Schmitz and Michal Guerquin and Michael Wilson and Regan Huff and Ronan Le Bras and Rui Xin and Rulin Shao and Sam Skjonsberg and Shannon Zejiang Shen and Shuyue Stella Li and Tucker Wilde and Valentina Pyatkin and Will Merrill and Yapei Chang and Yuling Gu and Zhiyuan Zeng and Ashish Sabharwal and Luke Zettlemoyer and Pang Wei Koh and Ali Farhadi and Noah A. Smith and Hannaneh Hajishirzi},
year={2025},
eprint={2512.13961},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2512.13961},
}


AI Summary: Based on Hugging Face metadata. Not a recommendation.


đŸ›Ąī¸ Dataset Transparency Report

Technical metadata sourced from upstream repositories.

Open Metadata

Identity & Source

id: hf-dataset--allenai--dolma3_longmino_mix-100b-1125
slug: allenai--dolma3_longmino_mix-100b-1125
source: huggingface
author: allenai
license: odc-by
tags: language:en, license:odc-by, arxiv:2512.13961, region:us
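
The registry id listed here appears to follow the pattern `hf-dataset--<author>--<name>`, which maps onto the Hugging Face repository path used in the citation URL on this page. The sketch below shows that mapping in Python. The pattern is inferred only from the values on this page, and `registry_id_to_repo` and `repo_url` are hypothetical helper names, not part of any published API.

```python
PREFIX = "hf-dataset--"  # Assumption: prefix inferred from the id shown on this page.


def registry_id_to_repo(registry_id: str) -> str:
    """Convert 'hf-dataset--<author>--<name>' into '<author>/<name>'."""
    if not registry_id.startswith(PREFIX):
        raise ValueError(f"unexpected registry id: {registry_id!r}")
    # Split on the first '--' only, so single hyphens in the name are preserved.
    author, sep, name = registry_id[len(PREFIX):].partition("--")
    if not sep or not author or not name:
        raise ValueError(f"unexpected registry id: {registry_id!r}")
    return f"{author}/{name}"


def repo_url(repo_id: str) -> str:
    """Build the dataset page URL in the same form as the citation above."""
    return f"https://huggingface.co/datasets/{repo_id}"


repo = registry_id_to_repo("hf-dataset--allenai--dolma3_longmino_mix-100b-1125")
url = repo_url(repo)
```

Splitting on the first `--` after the prefix keeps names that contain single hyphens, such as `dolma3_longmino_mix-100b-1125`, intact.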

âš™ī¸ Technical Specs

architecture
null
params billions
100
context length
4,096
pipeline tag

Engagement & Metrics

downloads: 73,128
stars: 15
forks: 0

Data indexed from public sources. Updated daily.