Dataset

Mavos Dd

by J Shen3 hf-dataset--j-shen3--mavos-dd
Free2AITools Nexus Index
59.7 Top 100%
S: Semantic 50
A: Authority 61
P: Popularity 51
R: Recency 95
Q: Quality 50
Parquet Format
Dataset Information Summary
Entity Passport
Registry ID hf-dataset--j-shen3--mavos-dd
Provider huggingface

Cite this dataset

Academic & Research Attribution

BibTeX
@misc{hf_dataset__j_shen3__mavos_dd,
  author = {J Shen3},
  title = {Mavos Dd Dataset},
  year = {2026},
  howpublished = {\url{https://huggingface.co/datasets/j-shen3/MAVOS-DD}},
  note = {Accessed via Free2AITools Knowledge Fortress}
}
APA Style
J Shen3. (2026). Mavos Dd [Dataset]. Free2AITools. https://huggingface.co/datasets/j-shen3/MAVOS-DD

Technical Deep Dive


âš–ī¸ Free2AITools Nexus Index V2.0


Downloads
31,672

đŸ‘ī¸ Data Preview

📊

Row-level preview not available for this dataset.

Schema structure is shown in the Field Logic panel when available.

🔗 Explore Full Dataset ↗

đŸ§Ŧ Field Logic

đŸ§Ŧ

Schema not yet indexed for this dataset.

Dataset Specification

LICENSE: This dataset is released under the CC BY-NC-SA 4.0 license.

This repository contains MAVOS-DD, an open-set benchmark for multilingual audio-video deepfake detection.

Below, you can find the code to obtain the subsets described in the paper: train, validation, open-set model, open-set language and open-set full:

{python}
from datasets import Dataset, concatenate_datasets

# Load the metadata previously saved to the 'MAVOS-DD' directory.
metadata = Dataset.load_from_disk('MAVOS-DD')
# In-domain test set: neither open-set flag is set.
metadata_indomain = metadata.filter(lambda sample: sample['split']=='test' and not sample['open_set_model'] and not sample['open_set_language'])
# Open-set model: in-domain samples plus samples flagged open_set_model only.
metadata_open_model = metadata.filter(lambda sample: sample['split']=='test' and sample['open_set_model'] and not sample['open_set_language'])
metadata_open_model = concatenate_datasets([metadata_indomain, metadata_open_model])
# Open-set language: in-domain samples plus samples flagged open_set_language only.
metadata_open_language = metadata.filter(lambda sample: sample['split']=='test' and not sample['open_set_model'] and sample['open_set_language'])
metadata_open_language = concatenate_datasets([metadata_indomain, metadata_open_language])
# Open-set full: the entire test split.
metadata_all = metadata.filter(lambda sample: sample['split']=='test')
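
The test subsets above overlap by construction, since the in-domain samples are included in each open-set subset. The following is a minimal sketch in plain Python of how the two flags partition the test split and how the subset sizes follow from that partition. The records are made up for illustration and are not rows from the dataset, and `partition_test_split` is a hypothetical helper, not part of the provided scripts.

```python
from collections import Counter

# Hypothetical records that mimic the metadata fields; not real dataset rows.
records = [
    {"split": "train", "open_set_model": False, "open_set_language": False},
    {"split": "test", "open_set_model": False, "open_set_language": False},
    {"split": "test", "open_set_model": True, "open_set_language": False},
    {"split": "test", "open_set_model": False, "open_set_language": True},
    {"split": "test", "open_set_model": True, "open_set_language": True},
]


def partition_test_split(rows):
    """Count test samples for each (open_set_model, open_set_language) pair."""
    return Counter(
        (row["open_set_model"], row["open_set_language"])
        for row in rows
        if row["split"] == "test"
    )


counts = partition_test_split(records)
in_domain = counts[(False, False)]
# Each open-set subset is the in-domain set plus the samples with one flag set.
open_model_size = in_domain + counts[(True, False)]
open_language_size = in_domain + counts[(False, True)]
# The full open-set subset is simply every test sample.
open_full_size = sum(counts.values())
print(open_model_size, open_language_size, open_full_size)
```

With these filters, any test sample that has both flags set would be counted only in the full open-set subset. Whether such samples exist in MAVOS-DD is not stated here.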

The scripts require the datasets package to be installed.

{bash}
pip install datasets

We provide two scripts: metadata_generation.py and dataset.py. The metadata_generation.py script is responsible for generating the metadata. Below is a sample metadata entry:

{bash}
Sample: {'video_path': 'arabic/inswapper/02690.png_Po82BhllEjA_340_1.mp4.mp4', 'label': 'fake', 'split': 'train', 'open_set_model': False, 'open_set_language': False, 'language': 'arabic', 'generative_method': 'inswapper'}
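
As a quick sanity check before filtering, an entry can be validated against the fields visible in the sample above. This is a sketch only: the field names and types are inferred from that single sample, and `EXPECTED_FIELDS` and `validate_entry` are hypothetical names that do not come from the provided scripts.

```python
# Field names and types inferred from the single sample entry shown above.
EXPECTED_FIELDS = {
    "video_path": str,
    "label": str,
    "split": str,
    "open_set_model": bool,
    "open_set_language": bool,
    "language": str,
    "generative_method": str,
}


def validate_entry(entry):
    """Return a list of problems found in one metadata entry (empty if none)."""
    problems = []
    for name, expected_type in EXPECTED_FIELDS.items():
        if name not in entry:
            problems.append(f"missing field: {name}")
        elif not isinstance(entry[name], expected_type):
            problems.append(f"{name} should be {expected_type.__name__}")
    return problems


sample = {
    "video_path": "arabic/inswapper/02690.png_Po82BhllEjA_340_1.mp4.mp4",
    "label": "fake",
    "split": "train",
    "open_set_model": False,
    "open_set_language": False,
    "language": "arabic",
    "generative_method": "inswapper",
}
print(validate_entry(sample))
```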

The dataset.py script includes examples of how to read and filter this metadata.
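
For orientation, here is a rough sketch of the kind of filtering this involves, written against plain dictionaries so it runs without the dataset on disk. It is not the content of dataset.py. The helper `matches` is hypothetical, and apart from the first record, the entries (including the language value 'romanian' and the method name 'other_method') are invented placeholders.

```python
def matches(sample, split, language, generative_method):
    """Return True if the entry belongs to the requested split, language and method."""
    return (
        sample["split"] == split
        and sample["language"] == language
        and sample["generative_method"] == generative_method
    )


# The first record reuses values from the sample entry; the rest are placeholders.
entries = [
    {"split": "train", "language": "arabic", "generative_method": "inswapper"},
    {"split": "test", "language": "arabic", "generative_method": "inswapper"},
    {"split": "train", "language": "romanian", "generative_method": "inswapper"},
    {"split": "train", "language": "arabic", "generative_method": "other_method"},
]

selected = [e for e in entries if matches(e, "train", "arabic", "inswapper")]
print(len(selected))
```

The same predicate can be passed to `Dataset.filter` from the datasets package, for example `metadata.filter(lambda s: matches(s, 'train', 'arabic', 'inswapper'))`, assuming `metadata` was loaded as in the earlier snippet.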

The code for running the baseline models can be found here: https://github.com/CroitoruAlin/MAVOS-DD

Note: Our dataset was collected from publicly available YouTube videos. If any individual wishes to request the removal of content involving them, please contact us at [email protected].

Citation:

{bibtex}
@misc{Croitoru-ArXiv-2025,
      title={MAVOS-DD: Multilingual Audio-Video Open-Set Deepfake Detection Benchmark}, 
      author={Florinel-Alin Croitoru and Vlad Hondru and Marius Popescu and Radu Tudor Ionescu and Fahad Shahbaz Khan and Mubarak Shah},
      year={2025},
      eprint={2505.11109},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2505.11109}, 
}

Structured Schema (Zero-Fabrication)

Feature Key Data Type
video Video
label ClassLabel

Estimated Rows: 87,038


đŸ›Ąī¸ Dataset Transparency Report

Technical metadata sourced from upstream repositories.

Open Metadata

Identity & Source

id: hf-dataset--j-shen3--mavos-dd
slug: j-shen3--mavos-dd
source: huggingface
author: J Shen3
license:
tags: task_categories:video-classification, language:ar, language:ro, language:en, language:de, language:hi, language:es, language:ru, size_categories:10k<n<100k, modality:video, library:datasets, library:mlcroissant, arxiv:2505.11109, region:us
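
The tags value is a flat, comma-separated list of `key:value` pairs, so the covered languages are easier to read once grouped by key. A small sketch, assuming every tag contains a colon as in the list above (`group_tags` is a hypothetical helper):

```python
from collections import defaultdict

# Tag string copied from the metadata listing above.
TAGS = (
    "task_categories:video-classification, language:ar, language:ro, "
    "language:en, language:de, language:hi, language:es, language:ru, "
    "size_categories:10k<n<100k, modality:video, library:datasets, "
    "library:mlcroissant, arxiv:2505.11109, region:us"
)


def group_tags(tag_string):
    """Group 'key:value' tags into a dict mapping each key to its values."""
    grouped = defaultdict(list)
    for tag in tag_string.split(","):
        # Split on the first colon only, so values keep any later colons.
        key, _, value = tag.strip().partition(":")
        grouped[key].append(value)
    return dict(grouped)


tags = group_tags(TAGS)
print(tags["language"])
```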

âš™ī¸ Technical Specs

architecture
null
params billions
null
context length
null
pipeline tag

Engagement & Metrics

downloads: 31,672
stars: 0
forks: null

Data indexed from public sources. Updated daily.