Fun-CosyVoice3-0.5B-2512-GGUF
Unofficial community GGUF release for CosyVoice3-2512.
This model pack is intended for use with cosyvoice.cpp.
Important Notice
- This is not an official CosyVoice release.
- The implementation is maintained by an independent community developer.
- The C++ runtime project (cosyvoice.cpp) is MIT-licensed.
- Model artifacts in this pack remain Apache-2.0.
Available Files
Current generated variants:
| File | Quantization | Size (approx.) | Notes |
| --- | --- | --- | --- |
| CosyVoice3-2512_F32.gguf | F32 | 3.21 GiB | Highest precision, largest size |
| CosyVoice3-2512_F16.gguf | F16 | 1.61 GiB | High-quality baseline |
| CosyVoice3-2512_Q8_0.gguf | Q8_0 | 0.88 GiB | Near-F16 quality, smaller |
| CosyVoice3-2512_Q6_K_S.gguf | Q6_K_S | 0.78 GiB | Good quality/size balance |
| CosyVoice3-2512_Q5_1.gguf | Q5_1 | 0.64 GiB | Better quality than Q5_0 |
| CosyVoice3-2512_Q5_K_S.gguf | Q5_K_S | 0.61 GiB | K-quant alternative |
| CosyVoice3-2512_Q5_0.gguf | Q5_0 | 0.59 GiB | Compact mid-quality |
| CosyVoice3-2512_Q4_1.gguf | Q4_1 | 0.54 GiB | Smaller size, audible pronunciation artifacts |
| CosyVoice3-2512_Q4_K_S.gguf | Q4_K_S | 0.54 GiB | Compact, but speech quality artifacts are common |
| CosyVoice3-2512_Q4_0.gguf | Q4_0 | 0.49 GiB | Further quality degradation, often unclear speech |
| CosyVoice3-2512_Q3_K_S.gguf | Q3_K_S | 0.44 GiB | Aggressive quantization |
| CosyVoice3-2512_Q2_K_S.gguf | Q2_K_S | 0.40 GiB | Smallest, strongest quality trade-off |
Quick Recommendations
- Default recommendation: Q8_0 (near-lossless listening quality in current tests at 0.88 GiB, versus 1.61 GiB for F16)
- High-quality alternatives: F16 and Q6_K_S
- Usable mid-tier options: Q5_1, Q5_K_S, Q5_0
- Not recommended for quality-sensitive use: Q4_1, Q4_K_S, Q4_0, Q3_K_S, Q2_K_S
Subjective listening notes from current tests:
- Q8_0: quality remains strong with little audible loss in typical samples.
- Q6_K_S: still sounds good and is often a practical choice.
- Q5 family (Q5_1 / Q5_K_S / Q5_0): generally usable with moderate degradation.
- Q4 family: audible pronunciation artifacts are common.
- Q3_K_S: quality drops further.
- Q2_K_S: mostly degrades into noise and is usually unusable.
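When disk or memory is the constraint, the table and recommendations above can be folded into a small selection helper. The following is a minimal sketch, not part of cosyvoice.cpp: `pick_variant` and `VARIANTS` are hypothetical names, the sizes are copied from the table, and treating "largest file that fits" as "best quality that fits" is an assumption of this example rather than a measured ranking.

```bash
#!/usr/bin/env bash
# Sketch: choose a quantization variant for a given size budget in GiB.
# Sizes are copied from the "Available Files" table, largest first.
# Assumption: a larger file means better quality (not a measured ranking).

VARIANTS='CosyVoice3-2512_F32.gguf 3.21
CosyVoice3-2512_F16.gguf 1.61
CosyVoice3-2512_Q8_0.gguf 0.88
CosyVoice3-2512_Q6_K_S.gguf 0.78
CosyVoice3-2512_Q5_1.gguf 0.64
CosyVoice3-2512_Q5_K_S.gguf 0.61
CosyVoice3-2512_Q5_0.gguf 0.59
CosyVoice3-2512_Q4_1.gguf 0.54
CosyVoice3-2512_Q4_K_S.gguf 0.54
CosyVoice3-2512_Q4_0.gguf 0.49
CosyVoice3-2512_Q3_K_S.gguf 0.44
CosyVoice3-2512_Q2_K_S.gguf 0.40'

# pick_variant MAX_GIB
# Prints the first (largest) listed file whose size is <= MAX_GIB.
# Returns 1 if nothing fits.
pick_variant() {
  local max_gib="$1" choice
  choice="$(printf '%s\n' "$VARIANTS" |
    awk -v max="$max_gib" '!found && $2 + 0 <= max + 0 { print $1; found = 1 }')"
  [ -n "$choice" ] || return 1

  # Q4 and below are listed as not recommended for quality-sensitive use.
  case "$choice" in
    *_Q4_*|*_Q3_*|*_Q2_*)
      echo "warning: $choice is not recommended for quality-sensitive use" >&2 ;;
  esac
  printf '%s\n' "$choice"
}
```

For example, `pick_variant 1.0` selects the Q8_0 file, which matches the default recommendation, while a budget of 0.5 falls through to Q4_0 and prints the warning on stderr.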
Runtime Requirements
- Built for the cosyvoice.cpp GGUF inference pipeline.
- For CPU mode, parts of the runtime currently require x86 AVX2 support.
- GPU backend behavior depends on the GGML backend and driver stack in use.
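Since CPU mode currently depends on x86 AVX2, it can be worth checking the host before downloading a model. A minimal, Linux-only sketch follows; it assumes the kernel lists CPU features on the `flags` lines of `/proc/cpuinfo`, and `has_avx2` is a hypothetical helper name, not something shipped with cosyvoice.cpp.

```bash
#!/usr/bin/env bash
# Sketch: check for AVX2 support before running in CPU mode (Linux only).
# Assumption: CPU features appear on "flags" lines in /proc/cpuinfo.

# has_avx2 [CPUINFO_FILE]
# Returns 0 if a flags line contains the whole word "avx2", non-zero otherwise.
# The optional argument allows checking a saved cpuinfo dump.
has_avx2() {
  local cpuinfo="${1:-/proc/cpuinfo}"
  grep -Eq '^flags[[:space:]]*:.*[[:space:]]avx2([[:space:]]|$)' "$cpuinfo" 2>/dev/null
}
```

A typical call would be `has_avx2 || echo "AVX2 not detected; CPU mode may not run" >&2`. On other operating systems the feature list has to come from a different source, so a non-zero result there only means "unknown".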
Basic Usage
```bash
cosyvoice-cli \
  --model CosyVoice3-2512_Q8_0.gguf \
  --prompt-speech prompt_speech.gguf \
  --text "Hello from CosyVoice" \
  --output out.wav
```
--prompt-speech expects a prompt-speech file in GGUF format (for example prompt_speech.gguf).
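Because both the model and the prompt-speech input are GGUF files, a quick sanity check before synthesis can catch a wrong or truncated file early. The sketch below relies on the fact that GGUF files begin with the four ASCII bytes `GGUF`, and it calls `cosyvoice-cli` with only the flags shown above. The helper names, the `COSYVOICE_CLI` override, and the `line_NNN.wav` naming scheme are assumptions made for illustration.

```bash
#!/usr/bin/env bash
# Sketch: validate inputs, then synthesize one WAV per non-empty line of a text file.
# Uses only the cosyvoice-cli flags shown in the basic usage example.
# Assumptions: helper names, COSYVOICE_CLI override, output file naming.

# is_gguf FILE
# True if FILE starts with the 4-byte ASCII magic "GGUF".
is_gguf() {
  [ "$(head -c 4 "$1" 2>/dev/null | tr -d '\0')" = "GGUF" ]
}

# synthesize_lines MODEL PROMPT_SPEECH TEXT_FILE OUT_DIR
synthesize_lines() {
  local model="$1" prompt="$2" text_file="$3" out_dir="$4"
  local cli="${COSYVOICE_CLI:-cosyvoice-cli}" line n=0

  is_gguf "$model"  || { echo "not a GGUF file: $model" >&2; return 1; }
  is_gguf "$prompt" || { echo "not a GGUF file: $prompt" >&2; return 1; }
  mkdir -p "$out_dir" || return 1

  while IFS= read -r line || [ -n "$line" ]; do
    [ -n "$line" ] || continue   # skip empty lines
    n=$((n + 1))
    # stdin is redirected so the CLI cannot consume the loop's input.
    "$cli" \
      --model "$model" \
      --prompt-speech "$prompt" \
      --text "$line" \
      --output "$(printf '%s/line_%03d.wav' "$out_dir" "$n")" \
      < /dev/null || return 1
  done < "$text_file"
}
```

A call such as `synthesize_lines CosyVoice3-2512_Q8_0.gguf prompt_speech.gguf lines.txt out` would write `out/line_001.wav`, `out/line_002.wav`, and so on. The magic check only confirms the container format; it does not verify that a file is a valid CosyVoice model or prompt.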
For runtime usage and documentation, see the cosyvoice.cpp repository.
Provenance
- This work follows the architecture and inference behavior of upstream CosyVoice releases, then re-implements the pipeline for C++/GGML deployment.
- Tokenizer implementation is adapted from llama.cpp and modified for this project.
License
- Model files in this repository: Apache-2.0
- Runtime code (cosyvoice.cpp): MIT