CardioBERTpt - Portuguese Transformer-based Models for Clinical Language Representation in Cardiology
This model card describes CardioBERTpt, a clinical language model for Portuguese trained on cardiology-domain text and intended for named-entity recognition (NER) tasks. It is a fine-tuned version of bert-base-multilingual-cased on a cardiology text dataset.
It achieves the following results on the evaluation set:
Loss: 0.4495
Accuracy: 0.8864
How to use the model
Load the model via the transformers library:
```python
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("pucpr-br/cardiobertpt")
model = AutoModel.from_pretrained("pucpr-br/cardiobertpt")
```
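Since the card targets NER, loading the checkpoint with a token-classification head and wrapping it in a pipeline may be more convenient. The sketch below is an assumption, not an official usage example from this card: it presumes the hosted weights carry fine-tuned NER labels in their config, and the example sentence is hypothetical. Running it downloads the checkpoint from the Hugging Face Hub.

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

# Load the checkpoint with a token-classification head (assumes the hosted
# config includes the fine-tuned NER label set).
tokenizer = AutoTokenizer.from_pretrained("pucpr-br/cardiobertpt")
model = AutoModelForTokenClassification.from_pretrained("pucpr-br/cardiobertpt")

# aggregation_strategy="simple" merges word-piece tokens back into whole words.
ner = pipeline(
    "token-classification",
    model=model,
    tokenizer=tokenizer,
    aggregation_strategy="simple",
)

# Hypothetical Portuguese clinical sentence.
for entity in ner("Paciente com infarto agudo do miocárdio."):
    print(entity["word"], entity["entity_group"], round(entity["score"], 3))
```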
Training hyperparameters
The following hyperparameters were used during training:
learning_rate: 1e-05
train_batch_size: 4
eval_batch_size: 4
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
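The hyperparameters above map directly onto a transformers TrainingArguments object. The following is a minimal configuration sketch mirroring those values; output_dir and num_train_epochs are illustrative assumptions, since the card does not state them.

```python
from transformers import TrainingArguments

# Mirrors the hyperparameters listed above; output_dir is an
# illustrative assumption, not a value reported in this card.
args = TrainingArguments(
    output_dir="cardiobertpt-ner",
    learning_rate=1e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```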
Funding
This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001, and by Foxconn Brazil and Zerbini Foundation as part of the research project Machine Learning in Cardiovascular Medicine.
Citation
```bibtex
@INPROCEEDINGS{10178779,
  author={Schneider, Elisa Terumi Rubel and Gumiel, Yohan Bonescki and de Souza, João Vitor Andrioli and Mie Mukai, Lilian and Emanuel Silva e Oliveira, Lucas and de Sa Rebelo, Marina and Antonio Gutierrez, Marco and Eduardo Krieger, Jose and Teodoro, Douglas and Moro, Claudia and Paraiso, Emerson Cabrera},
  booktitle={2023 IEEE 36th International Symposium on Computer-Based Medical Systems (CBMS)},
  title={CardioBERTpt: Transformer-based Models for Cardiology Language Representation in Portuguese},
  year={2023},
  pages={378-381},
  doi={10.1109/CBMS58004.2023.00247}
}
```