🧠 Model

XTTS-v2

by coqui

--- license: other license_name: coqui-public-model-license license_link: https://coqui.ai/cpml library_name: coqui pipeline_tag: text-to-speech widget: - text:

πŸ• Updated 12/19/2025

🧠 Architecture Explorer

Neural network architecture

1 Input Layer
2 Hidden Layers
3 Attention
4 Output Layer

About

ⓍTTS is a Voice generation model that lets you clone voices into different languages by using just a quick 6-second audio clip. There is no need for an excessive amount of training data that spans countless hours. This is the same or similar model to what powers Coqui Stu...

πŸ“ Limitations & Considerations

  • β€’ Benchmark scores may vary based on evaluation methodology and hardware configuration.
  • β€’ VRAM requirements are estimates; actual usage depends on quantization and batch size.
  • β€’ FNI scores are relative rankings and may change as new models are added.
  • β€’ Data source: [{"source_platform":"huggingface","source_url":"https://huggingface.co/coqui/XTTS-v2","fetched_at":"2025-12-19T07:41:01.175Z","adapter_version":"3.2.0"}]

πŸ“š Related Resources

πŸ“„ Related Papers

No related papers linked yet. Check the model's official documentation for research papers.

πŸ“Š Training Datasets

Training data information not available. Refer to the original model card for details.

πŸ”— Related Models

Data unavailable

πŸš€ What's Next?