Model
XTTS-v2
by coqui
---
license: other
license_name: coqui-public-model-license
license_link: https://coqui.ai/cpml
library_name: coqui
pipeline_tag: text-to-speech
widget:
- text:
Updated 12/19/2025
About
XTTS is a voice generation model that lets you clone a voice into different languages from just a 6-second audio clip, with no need for hours of training data. This is the same or similar model to what powers Coqui Stu...
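The cloning workflow described above can be sketched as follows. This is a minimal sketch, assuming the Coqui `TTS` Python package is installed and that the reference clip is a PCM WAV file. The helper names `clip_duration_seconds` and `clone_voice` are hypothetical, and the 6-second check is a guideline taken from the description, not a limit the library is known to enforce.

```python
import wave

# Assumption: threshold taken from the "6-second audio clip" wording above;
# treated here as a guideline, not a limit enforced by the library.
MIN_REFERENCE_SECONDS = 6.0

# Model identifier used by the Coqui TTS package for XTTS-v2.
XTTS_MODEL_NAME = "tts_models/multilingual/multi-dataset/xtts_v2"


def clip_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def clone_voice(text: str, speaker_wav: str, language: str, out_path: str) -> str:
    """Synthesize `text` in the voice of `speaker_wav` (hypothetical helper)."""
    duration = clip_duration_seconds(speaker_wav)
    if duration < MIN_REFERENCE_SECONDS:
        raise ValueError(
            f"reference clip is {duration:.1f}s; "
            f"expected at least {MIN_REFERENCE_SECONDS:.0f}s"
        )

    # Imported lazily so the duration check works without the package installed.
    from TTS.api import TTS

    tts = TTS(XTTS_MODEL_NAME)  # downloads the model on first use
    tts.tts_to_file(
        text=text,
        speaker_wav=speaker_wav,  # short reference clip of the target voice
        language=language,        # language code of the output speech, e.g. "en"
        file_path=out_path,
    )
    return out_path
```

A call such as `clone_voice("Hello there.", "reference.wav", "en", "output.wav")` would write the synthesized speech to `output.wav`; both file names are placeholders.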
Limitations & Considerations
- Benchmark scores may vary based on evaluation methodology and hardware configuration.
- VRAM requirements are estimates; actual usage depends on quantization and batch size.
- FNI scores are relative rankings and may change as new models are added.
- Data source: Hugging Face (https://huggingface.co/coqui/XTTS-v2), fetched 2025-12-19T07:41:01.175Z, adapter version 3.2.0.
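To illustrate why VRAM figures shift with quantization, here is a rough back-of-the-envelope sketch. It counts model weights only; activations, which grow with batch size, and framework overhead are left out. The function name `weight_memory_gib` and the parameter count are hypothetical and do not reflect XTTS-v2's actual size.

```python
def weight_memory_gib(num_parameters: int, bits_per_parameter: int) -> float:
    """Approximate memory needed for model weights alone, in GiB."""
    bytes_total = num_parameters * bits_per_parameter / 8
    return bytes_total / 2**30


# Hypothetical parameter count for illustration only (not XTTS-v2's real size).
EXAMPLE_PARAMETERS = 1_000_000_000

for bits in (32, 16, 8):
    gib = weight_memory_gib(EXAMPLE_PARAMETERS, bits)
    print(f"{bits}-bit weights: about {gib:.2f} GiB")
```

Halving the bits per parameter halves the weight footprint, which is why the same model can report different VRAM needs depending on how it is loaded.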
Related Resources
Related Papers
No related papers linked yet. Check the model's official documentation for research papers.
Training Datasets
Training data information not available. Refer to the original model card for details.
Related Models
Data unavailable