Dataset

Common Corpus

by PleIAs
hf-dataset--pleias--common_corpus
Nexus Index
41.0 Top 100%
S: Semantic 50
A: Authority 0
P: Popularity 63
R: Recency 71
Q: Quality 30
Tech Context
Vital Performance
0 DL / 30D
0.0%
Data Integrity 41 FNI Score
- Size
- Rows
Parquet Format
- Tokens
Dataset Information Summary
Entity Passport
Registry ID hf-dataset--pleias--common_corpus
Provider huggingface

Cite this dataset

Academic & Research Attribution

BibTeX
@misc{hf_dataset__pleias__common_corpus,
  author = {PleIAs},
  title = {Common Corpus Dataset},
  year = {2026},
  howpublished = {\url{https://huggingface.co/datasets/pleias/common_corpus}},
  note = {Accessed via Free2AITools Knowledge Fortress}
}
APA Style
PleIAs. (2026). Common Corpus [Dataset]. Free2AITools. https://huggingface.co/datasets/pleias/common_corpus

Technical Deep Dive

Full Specifications

Nexus Index V2.0

Index Insight

FNI V2.0 for Common Corpus: Semantic (S:50), Authority (A:0), Popularity (P:63), Recency (R:71), Quality (Q:30).

Free2AITools Nexus Index

⬇️
Downloads
261,284

Data Preview

Row-level preview not available for this dataset.

Schema structure is shown in the Field Logic panel when available.

🧬 Field Logic

🧬

Schema not yet indexed for this dataset.

Dataset Specification

Social Proof

HuggingFace Hub
261.3K Downloads
Daily sync (03:00 UTC)

AI Summary: Based on Hugging Face metadata. Not a recommendation.

Dataset Transparency Report

Technical metadata sourced from upstream repositories.

Open Metadata

Identity & Source

id: hf-dataset--pleias--common_corpus
slug: pleias--common_corpus
source: huggingface
author: PleIAs
license: (none listed)
tags: language:en, language:fr, language:de, language:it, language:es, language:la, language:nl, language:pl, size_categories:100m<n<1b, format:parquet, modality:tabular, modality:text, library:datasets, library:dask, library:polars, library:mlcroissant, arxiv:2506.01732, arxiv:2410.22587, region:us, language:zh, language:ja, language:ru, language:ar, language:ko, size_categories:10k<n<100k, library:pandas
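
The tags value above packs several kinds of metadata into one comma-separated string of prefix:value pairs. As a rough sketch only (the helper name `group_tags` is hypothetical and is not part of any Hugging Face or Free2AITools API), grouping the tags by prefix makes the languages, libraries, and size categories easier to inspect:

```python
from collections import defaultdict

# Tag string copied verbatim from the listing's tags field.
RAW_TAGS = (
    "language:en, language:fr, language:de, language:it, language:es, "
    "language:la, language:nl, language:pl, size_categories:100m<n<1b, "
    "format:parquet, modality:tabular, modality:text, library:datasets, "
    "library:dask, library:polars, library:mlcroissant, arxiv:2506.01732, "
    "arxiv:2410.22587, region:us, language:zh, language:ja, language:ru, "
    "language:ar, language:ko, size_categories:10k<n<100k, library:pandas"
)


def group_tags(raw: str) -> dict[str, list[str]]:
    """Group comma-separated prefix:value tags by prefix, keeping source order.

    Hypothetical helper written for illustration.
    """
    grouped: dict[str, list[str]] = defaultdict(list)
    for tag in raw.split(","):
        tag = tag.strip()
        if not tag:
            continue
        # Split on the first colon only, so a value that itself
        # contained a colon would stay intact.
        prefix, _, value = tag.partition(":")
        grouped[prefix].append(value)
    return dict(grouped)


tags = group_tags(RAW_TAGS)
for prefix, values in tags.items():
    print(f"{prefix}: {', '.join(values)}")
```

Grouped this way, the listing shows two different size_categories values (100m<n<1b and 10k<n<100k). The page gives no indication of which one is current, so the row count should be checked against the original Hugging Face source.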

Technical Specs

architecture: null
params billions: null
context length: null
pipeline tag: (none listed)

Engagement & Metrics

downloads: 261,284
stars: 394
forks: 0

Data indexed from public sources. Updated daily.