Dataset

CMU ARCTIC X-Vectors

by Matthijs (hf-dataset--matthijs--cmu-arctic-xvectors)
Nexus Index
22.0 Top 3%
P / V / C / U Breakdown Calibration Pending

Pillar scores are computed during the next indexing cycle.



Data Integrity 22 FNI Score
Parquet Format
Dataset Information Summary
Entity Passport
Registry ID hf-dataset--matthijs--cmu-arctic-xvectors
Provider huggingface

Cite this dataset

Academic & Research Attribution

BibTeX
@misc{hf_dataset__matthijs__cmu_arctic_xvectors,
  author = {Matthijs},
  title = {Cmu Arctic Xvectors Dataset},
  year = {2026},
  howpublished = {\url{https://huggingface.co/datasets/Matthijs/cmu-arctic-xvectors}},
  note = {Accessed via Free2AITools Knowledge Fortress}
}
APA Style
Matthijs. (2026). Cmu Arctic Xvectors [Dataset]. Free2AITools. https://huggingface.co/datasets/Matthijs/cmu-arctic-xvectors

Technical Deep Dive


âš–ī¸ Nexus Index V16.5

22.0
ESTIMATED IMPACT TIER
Popularity (P) 0
Freshness (F) 0
Completeness (C) 0
Utility (U) 0

Index Insight

The Free2AITools Nexus Index for Cmu Arctic Xvectors aggregates Popularity (P:0), Freshness (F:0), and Completeness (C:0). The Utility score (U:0) represents deployment readiness and ecosystem adoption.

Free2AITools Nexus Index

Verification Authority

Unbiased Data Node Refresh: VFS Live
Downloads: 20,673
Likes: 62

đŸ‘ī¸ Data Preview

📊

Row-level preview not available for this dataset.

Schema structure is shown in the Field Logic panel when available.

🔗 Explore Full Dataset ↗

đŸ§Ŧ Field Logic

đŸ§Ŧ

Schema not yet indexed for this dataset.

Dataset Specification


pretty_name: CMU ARCTIC X-Vectors
task_categories:

  • text-to-speech
  • audio-to-audio
license: mit

Speaker embeddings extracted from CMU ARCTIC

There is one .npy file for each utterance in the dataset, 7931 files in total. The speaker embeddings are 512-element X-vectors.
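
Because every embedding has the same length, two utterances can be compared directly, for instance with cosine similarity. The following is a minimal sketch using only the Python standard library; the helper name `cosine_similarity` and the toy vectors are illustrative and are not part of the dataset or its tooling.

```python
import math

XVECTOR_DIM = 512  # embedding size stated in the dataset card


def cosine_similarity(a, b):
    """Cosine similarity between two X-vectors of the expected length."""
    if len(a) != XVECTOR_DIM or len(b) != XVECTOR_DIM:
        raise ValueError(f"expected {XVECTOR_DIM}-element vectors")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real embeddings (hypothetical values).
same_direction = [1.0] * XVECTOR_DIM
scaled_copy = [2.0] * XVECTOR_DIM
print(cosine_similarity(same_direction, scaled_copy))
```

Embeddings from the same speaker would be expected to score higher than embeddings from different speakers, though the exact values depend on the data.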

The CMU ARCTIC dataset divides the utterances among the following speakers:

  • bdl (US male)
  • slt (US female)
  • jmk (Canadian male)
  • awb (Scottish male)
  • rms (US male)
  • clb (US female)
  • ksp (Indian male)
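
To work with one speaker at a time, rows can be grouped by speaker ID. The sketch below assumes each row carries a filename that embeds the speaker ID in the form `cmu_us_<speaker>_arctic...`; that pattern, and the helper names `speaker_of` and `indices_by_speaker`, are assumptions not stated above, so check them against the actual data before relying on this.

```python
import re
from collections import defaultdict

# Speaker IDs and descriptions as listed in the dataset card.
SPEAKERS = {
    "bdl": "US male",
    "slt": "US female",
    "jmk": "Canadian male",
    "awb": "Scottish male",
    "rms": "US male",
    "clb": "US female",
    "ksp": "Indian male",
}

# Assumed filename pattern (hypothetical): cmu_us_<speaker>_arctic...
_SPEAKER_RE = re.compile(r"cmu_us_([a-z]{3})_arctic")


def speaker_of(filename):
    """Return the speaker ID embedded in a filename, or None if absent."""
    match = _SPEAKER_RE.search(filename)
    if match and match.group(1) in SPEAKERS:
        return match.group(1)
    return None


def indices_by_speaker(filenames):
    """Map each speaker ID to the row indices of that speaker's utterances."""
    groups = defaultdict(list)
    for index, name in enumerate(filenames):
        speaker = speaker_of(name)
        if speaker is not None:
            groups[speaker].append(index)
    return dict(groups)


# Hypothetical filenames for illustration only.
example = [
    "cmu_us_bdl_arctic-wav-arctic_a0001",
    "cmu_us_slt_arctic-wav-arctic_a0001",
    "cmu_us_bdl_arctic-wav-arctic_a0002",
]
print(indices_by_speaker(example))
```

With the real dataset, the list of filenames would come from whichever column stores them, if one exists; the resulting index lists can then be used to pick a row for a chosen speaker.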

The X-vectors were extracted using this script, which uses the speechbrain/spkrec-xvect-voxceleb model.
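
The extraction script itself is not reproduced here. As a rough sketch of how an X-vector might be computed with that model, the following uses SpeechBrain's `EncoderClassifier`. The function name `extract_xvector`, the mono down-mix, the 16 kHz resampling step, and the L2 normalization are assumptions made for illustration, not a description of the original script.

```python
def extract_xvector(wav_path, model_source="speechbrain/spkrec-xvect-voxceleb"):
    """Return a 512-element X-vector for one audio file (illustrative sketch)."""
    # Heavy dependencies are imported lazily so this module loads without them.
    import torch
    import torchaudio

    try:
        # Import path used by SpeechBrain 1.0 and later.
        from speechbrain.inference.speaker import EncoderClassifier
    except ImportError:
        # Import path used by older SpeechBrain releases.
        from speechbrain.pretrained import EncoderClassifier

    classifier = EncoderClassifier.from_hparams(source=model_source)

    signal, sample_rate = torchaudio.load(wav_path)  # shape: (channels, samples)
    signal = signal.mean(dim=0, keepdim=True)  # down-mix to mono (assumption)
    if sample_rate != 16000:
        # Assumption: the model expects 16 kHz input.
        signal = torchaudio.functional.resample(signal, sample_rate, 16000)

    with torch.no_grad():
        embedding = classifier.encode_batch(signal)  # shape: (1, 1, 512)
        # Assumption: L2-normalize along the feature dimension.
        embedding = torch.nn.functional.normalize(embedding, dim=2)

    return embedding.squeeze().cpu().numpy()
```

Calling the function downloads the pretrained model on first use, so it needs network access and the `torch`, `torchaudio`, and `speechbrain` packages installed.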

Usage:

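import torch
from datasets import load_dataset

# Load the precomputed X-vectors from the Hugging Face Hub.
embeddings_dataset = load_dataset("Matthijs/cmu-arctic-xvectors", split="validation")

# Take one utterance's X-vector and add a batch dimension.
speaker_embeddings = embeddings_dataset[7306]["xvector"]
speaker_embeddings = torch.tensor(speaker_embeddings).unsqueeze(0)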

Top Tier

Social Proof

HuggingFace Hub
62 Likes
20.7K Downloads
Daily sync (03:00 UTC)

AI Summary: Based on Hugging Face metadata. Not a recommendation.


đŸ›Ąī¸ Dataset Transparency Report

Verified data manifest for traceability and transparency.

100% Data Disclosure Active

Identity & Source

id
hf-dataset--matthijs--cmu-arctic-xvectors
source
huggingface
author
Matthijs
tags
task_categories:text-to-speech, task_categories:audio-to-audio, license:mit, size_categories:1k, modality:text, modality:timeseries, library:datasets, library:mlcroissant, region:us

âš™ī¸ Technical Specs

architecture
null
params billions
null
context length
null

Engagement & Metrics

likes
62
downloads
20,673

Free2AITools Constitutional Data Pipeline: Curated disclosure mode active. (V15.x Standard)