Release contents were curated from the run's final/ directory plus processor artifacts.
This release is intended for inference and reproducible loading from the Hugging Face Hub.
Evaluation
Validation metrics from eval_results.json:
WER: 0.21364559684550125
CER: 0.15108458842474187
Epoch: 31.380753138075313
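For readers unfamiliar with these metrics, the following is a minimal sketch of the standard WER/CER definition: total edit distance divided by total reference length, counted over words or characters. It is an illustration only. The card does not say how eval_results.json was produced, and the helper names `edit_distance` and `error_rate` are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences of tokens."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def error_rate(references, hypotheses, unit="word"):
    """Corpus-level error rate: total edits / total reference length.

    unit="word" gives WER, unit="char" gives CER (spaces count as characters).
    Assumes the references are not all empty.
    """
    total_edits = 0
    total_len = 0
    for ref, hyp in zip(references, hypotheses):
        ref_units = ref.split() if unit == "word" else list(ref)
        hyp_units = hyp.split() if unit == "word" else list(hyp)
        total_edits += edit_distance(ref_units, hyp_units)
        total_len += len(ref_units)
    return total_edits / total_len


# One substitution and one deletion against a 4-word reference.
print(error_rate(["a b c d"], ["a x c"]))  # 0.5
```

Under this definition, a WER of about 0.214 means roughly one word-level edit for every five reference words.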
Usage
```python
from transformers import AutoModelForCTC, AutoProcessor, pipeline

repo_id = "FormosanBank/xls-r-53-stage1-seediq-asr"

# Load the processor (tokenizer + feature extractor) and the CTC model from the Hub.
processor = AutoProcessor.from_pretrained(repo_id)
model = AutoModelForCTC.from_pretrained(repo_id)

pipe = pipeline(
    "automatic-speech-recognition",
    model=model,
    tokenizer=processor.tokenizer,
    feature_extractor=processor.feature_extractor,
)

# Transcribe a local audio file.
result = pipe("path/to/audio.wav")
print(result["text"])
```
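To transcribe several recordings at once, the pipeline object can be called with a list of paths. The helper below is a hedged sketch: `transcribe_files` is a hypothetical name, and it assumes the callable returns one dict with a "text" key per input, which is how the transformers ASR pipeline behaves for list input.

```python
from pathlib import Path


def transcribe_files(asr, paths):
    """Run an ASR callable over several files and return {path: transcript}.

    `asr` is assumed to accept a list of path strings and return a list of
    dicts with a "text" key, in the same order as the inputs.
    """
    paths = [str(Path(p)) for p in paths]
    outputs = asr(paths)
    return {p: out["text"] for p, out in zip(paths, outputs)}


# Usage with the `pipe` object created in the Usage snippet (paths are placeholders):
# transcripts = transcribe_files(pipe, ["path/to/a.wav", "path/to/b.wav"])
# for path, text in transcripts.items():
#     print(path, text)
```

Taking the pipeline as an argument keeps the helper independent of how the model was loaded.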
Limitations
This model is intended for Formosan language ASR research and educational use.
Performance can vary by corpus, speaker, recording quality, orthography conventions,
and domain mismatch.
If you use this model, please cite FormosanBank itself:
```bibtex
@misc{mohamed2024formosanbank,
  author = {Mohamed, W. and Le Ferrand, É. and Sung, L.-M. and Prud'hommeaux, E. and Hartshorne, J. K.},
  title  = {FormosanBank},
  year   = {2024},
  note   = {Electronic Resource},
  url    = {https://ai4commsci.gitbook.io/formosanbank}
}
```
License and attribution
FormosanBank annotations and metadata are licensed under CC-BY-4.0.
You must cite the source in any redistributed or derived products.
For code packages, you may refer to the GitHub repository.
For academic publications, cite the FormosanBank electronic resource above.