Dataset

CMU ARCTIC X-Vectors

by Matthijs hf-dataset--matthijs--cmu-arctic-xvectors

Nexus Index

22.0 Top 3%




Parquet Format

Dataset Information Summary
Entity Passport
Registry ID	hf-dataset--matthijs--cmu-arctic-xvectors
Provider	huggingface

📜

Cite this dataset

Academic & Research Attribution

BibTeX

@misc{hf_dataset__matthijs__cmu_arctic_xvectors,
  author = {Matthijs},
  title = {Cmu Arctic Xvectors Dataset},
  year = {2026},
  howpublished = {\url{https://huggingface.co/datasets/Matthijs/cmu-arctic-xvectors}},
  note = {Accessed via Free2AITools Knowledge Fortress}
}

APA Style

Matthijs. (2026). Cmu Arctic Xvectors [Dataset]. Free2AITools. https://huggingface.co/datasets/Matthijs/cmu-arctic-xvectors


⚖️ Nexus Index V16.5

Methodology Index Protocol

22.0

ESTIMATED IMPACT TIER

Popularity (P) 0

Freshness (F) 0

Completeness (C) 0

Utility (U) 0

💬 Index Insight

The Free2AITools Nexus Index for Cmu Arctic Xvectors aggregates Popularity (P:0), Freshness (F:0), and Completeness (C:0). The Utility score (U:0) represents deployment readiness and ecosystem adoption.


Dataset Specification

pretty_name: CMU ARCTIC X-Vectors
task_categories:
- text-to-speech
- audio-to-audio
license: mit

Speaker embeddings extracted from CMU ARCTIC

There is one .npy file for each utterance in the dataset, 7931 files in total. The speaker embeddings are 512-element X-vectors.

The CMU ARCTIC dataset divides the utterances among the following speakers:

bdl (US male)
slt (US female)
jmk (Canadian male)
awb (Scottish male)
rms (US male)
clb (US female)
ksp (Indian male)
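
The speaker list above can be captured as a small lookup table. The following is a minimal sketch, not part of the original dataset card. The helper `speaker_from_filename` is hypothetical, and it assumes that filenames follow the pattern `cmu_us_bdl_arctic-wav-arctic_a0001`, with the speaker code as an underscore-delimited token. Check the actual filenames before relying on it.

```python
from typing import Optional

# Speaker codes and descriptions exactly as listed in the dataset card.
SPEAKERS = {
    "bdl": "US male",
    "slt": "US female",
    "jmk": "Canadian male",
    "awb": "Scottish male",
    "rms": "US male",
    "clb": "US female",
    "ksp": "Indian male",
}


def speaker_from_filename(filename: str) -> Optional[str]:
    """Return the speaker code found in a filename, or None if there is none.

    ASSUMPTION: the filename looks like "cmu_us_bdl_arctic-wav-arctic_a0001",
    so the speaker code appears as its own underscore-delimited token.
    """
    for token in filename.split("_"):
        if token in SPEAKERS:
            return token
    return None


# Example with an assumed (hypothetical) filename.
code = speaker_from_filename("cmu_us_bdl_arctic-wav-arctic_a0001")
description = SPEAKERS[code] if code else "unknown speaker"
```

If the real filenames use a different layout, only the token-matching loop needs to change; the lookup table itself comes straight from the list.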

The X-vectors were extracted using this script, which uses the speechbrain/spkrec-xvect-voxceleb model.

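Usage:

import torch
from datasets import load_dataset

# Load the "validation" split of the embeddings dataset
embeddings_dataset = load_dataset("Matthijs/cmu-arctic-xvectors", split="validation")

# Take the X-vector of one utterance and add a batch dimension: shape (1, 512)
speaker_embeddings = embeddings_dataset[7306]["xvector"]
speaker_embeddings = torch.tensor(speaker_embeddings).unsqueeze(0)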
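
The usage snippet picks an utterance by a fixed row index. A possible alternative is to pick the first embedding that belongs to a given speaker. The sketch below is illustrative only: `first_xvector_for_speaker` is a hypothetical helper, the `"filename"` column and its naming pattern are assumptions (only the `"xvector"` column appears in the card), and the toy rows stand in for the real dataset so the example runs without any download.

```python
from typing import Iterable, List, Mapping, Optional


def first_xvector_for_speaker(
    rows: Iterable[Mapping], speaker_id: str
) -> Optional[List[float]]:
    """Return the first X-vector whose filename mentions `speaker_id`.

    ASSUMPTION: each row has a "filename" field such as
    "cmu_us_slt_arctic-wav-arctic_a0001" alongside the "xvector" field.
    """
    marker = f"_{speaker_id}_"
    for row in rows:
        if marker in row["filename"]:
            return list(row["xvector"])
    return None


# Toy rows standing in for the dataset (values are made up).
rows = [
    {"filename": "cmu_us_bdl_arctic-wav-arctic_a0001", "xvector": [0.0] * 512},
    {"filename": "cmu_us_slt_arctic-wav-arctic_a0001", "xvector": [1.0] * 512},
]

xvector = first_xvector_for_speaker(rows, "slt")

# Plain-Python analogue of unsqueeze(0): wrap the vector in a batch of one,
# giving a nested list of shape (1, 512).
batched = [xvector]
```

Iterating over a dataset loaded with `load_dataset` yields one dictionary per row, so the same helper should accept `embeddings_dataset` in place of `rows`, provided the assumed `"filename"` column exists.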



🛡️ Dataset Transparency Report

Verified data manifest for traceability and transparency.

100% Data Disclosure Active

🆔 Identity & Source

id: hf-dataset--matthijs--cmu-arctic-xvectors
source: huggingface
author: Matthijs
tags: task_categories:text-to-speech, task_categories:audio-to-audio, license:mit, size_categories:1k, modality:text, modality:timeseries, library:datasets, library:mlcroissant, region:us


📊 Engagement & Metrics

likes: 62
downloads: 20,673
