calculator_model_test
This model is a fine-tuned version of an unspecified base model, trained on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 0.6463 (validation loss at epoch 40)
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.001
- train_batch_size: 512
- eval_batch_size: 512
- seed: 42
- optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 40
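The hyperparameters above map fairly directly onto `transformers.TrainingArguments`. The sketch below is one plausible reconstruction, not the author's actual training script. It assumes a single device (so the listed batch sizes are per-device values), zero warmup steps (none are listed), and evaluation once per epoch (inferred from the results table). The `output_dir` default and the helper names `build_training_arguments` and `linear_lr` are hypothetical.

```python
# Hyperparameters copied from the list above. Keyword names follow
# transformers.TrainingArguments; per-device batch sizes assume one device.
TRAINING_KWARGS = {
    "learning_rate": 1e-3,
    "per_device_train_batch_size": 512,
    "per_device_eval_batch_size": 512,
    "seed": 42,
    "optim": "adamw_torch_fused",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 40,
}


def build_training_arguments(output_dir="calculator_model_test"):
    """Build TrainingArguments from the listed values (hypothetical helper)."""
    # Imported lazily so the rest of this sketch runs without transformers.
    from transformers import TrainingArguments

    return TrainingArguments(
        output_dir=output_dir,   # assumption: any writable directory
        eval_strategy="epoch",   # assumption: one evaluation per epoch
        **TRAINING_KWARGS,
    )


def linear_lr(step, total_steps=200, base_lr=1e-3):
    """Learning rate under linear decay to zero, assuming no warmup.

    total_steps=200 matches 40 epochs of 5 optimizer steps each.
    """
    return base_lr * max(0.0, (total_steps - step) / total_steps)
```

Under these assumptions the learning rate starts at 0.001, is halved by step 100 (epoch 20), and reaches zero at step 200.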
Training results
| Training Loss | Epoch | Step | Validation Loss |
|---------------|-------|------|-----------------|
| 3.4635 | 1.0 | 5 | 2.8650 |
| 2.5314 | 2.0 | 10 | 2.1158 |
| 1.9819 | 3.0 | 15 | 1.8063 |
| 1.6986 | 4.0 | 20 | 1.6031 |
| 1.5730 | 5.0 | 25 | 1.5538 |
| 1.5102 | 6.0 | 30 | 1.5036 |
| 1.4481 | 7.0 | 35 | 1.4591 |
| 1.3779 | 8.0 | 40 | 1.3530 |
| 1.3051 | 9.0 | 45 | 1.2608 |
| 1.2610 | 10.0 | 50 | 1.1978 |
| 1.2074 | 11.0 | 55 | 1.2343 |
| 1.1567 | 12.0 | 60 | 1.1180 |
| 1.1375 | 13.0 | 65 | 1.1520 |
| 1.1428 | 14.0 | 70 | 1.1030 |
| 1.0972 | 15.0 | 75 | 1.0581 |
| 1.0503 | 16.0 | 80 | 0.9979 |
| 0.9758 | 17.0 | 85 | 0.9513 |
| 0.9473 | 18.0 | 90 | 0.9317 |
| 0.9206 | 19.0 | 95 | 0.9380 |
| 0.9384 | 20.0 | 100 | 0.8643 |
| 0.8769 | 21.0 | 105 | 0.9630 |
| 0.9673 | 22.0 | 110 | 0.9533 |
| 0.9098 | 23.0 | 115 | 0.8435 |
| 0.8675 | 24.0 | 120 | 0.8262 |
| 0.8382 | 25.0 | 125 | 0.8295 |
| 0.8148 | 26.0 | 130 | 0.7936 |
| 0.8002 | 27.0 | 135 | 0.7727 |
| 0.7794 | 28.0 | 140 | 0.7617 |
| 0.7631 | 29.0 | 145 | 0.7373 |
| 0.7419 | 30.0 | 150 | 0.7182 |
| 0.7297 | 31.0 | 155 | 0.7168 |
| 0.7208 | 32.0 | 160 | 0.6962 |
| 0.7054 | 33.0 | 165 | 0.6853 |
| 0.6964 | 34.0 | 170 | 0.6826 |
| 0.6895 | 35.0 | 175 | 0.6700 |
| 0.6787 | 36.0 | 180 | 0.6599 |
| 0.6689 | 37.0 | 185 | 0.6539 |
| 0.6651 | 38.0 | 190 | 0.6495 |
| 0.6646 | 39.0 | 195 | 0.6490 |
| 0.6592 | 40.0 | 200 | 0.6463 |
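A few things can be read off the training results with simple arithmetic. The sketch below finds the epoch with the lowest validation loss, lists the epochs where validation loss rose relative to the previous epoch, and bounds the training-set size implied by 5 steps per epoch at batch size 512. The size bound is an assumption-laden estimate: it presumes a single device, no gradient accumulation, and a final partial batch that is kept rather than dropped. All function names are hypothetical.

```python
# Validation loss per epoch, copied from the results table (epochs 1 to 40).
VAL_LOSS = [
    2.8650, 2.1158, 1.8063, 1.6031, 1.5538, 1.5036, 1.4591, 1.3530,
    1.2608, 1.1978, 1.2343, 1.1180, 1.1520, 1.1030, 1.0581, 0.9979,
    0.9513, 0.9317, 0.9380, 0.8643, 0.9630, 0.9533, 0.8435, 0.8262,
    0.8295, 0.7936, 0.7727, 0.7617, 0.7373, 0.7182, 0.7168, 0.6962,
    0.6853, 0.6826, 0.6700, 0.6599, 0.6539, 0.6495, 0.6490, 0.6463,
]

STEPS_PER_EPOCH = 5   # the Step column advances by 5 per epoch
BATCH_SIZE = 512      # train_batch_size from the hyperparameter list


def best_epoch(losses):
    """Return (epoch, loss) for the lowest loss; epochs are 1-based."""
    idx = min(range(len(losses)), key=losses.__getitem__)
    return idx + 1, losses[idx]


def regressions(losses):
    """Return the 1-based epochs whose loss is higher than the previous epoch's."""
    return [i + 1 for i in range(1, len(losses)) if losses[i] > losses[i - 1]]


def train_set_size_bounds(steps_per_epoch, batch_size):
    """Smallest and largest N with ceil(N / batch_size) == steps_per_epoch."""
    return (steps_per_epoch - 1) * batch_size + 1, steps_per_epoch * batch_size


print(best_epoch(VAL_LOSS))                                # (40, 0.6463)
print(regressions(VAL_LOSS))                               # [11, 13, 19, 21, 25]
print(train_set_size_bounds(STEPS_PER_EPOCH, BATCH_SIZE))  # (2049, 2560)
```

Validation loss is lowest at the final epoch and was still decreasing slowly there, so the table alone does not show whether training had fully converged.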
Framework versions
- Transformers 5.0.0
- PyTorch 2.10.0+cu128
- Datasets 4.0.0
- Tokenizers 0.22.2
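To compare a local environment against the versions listed above, something like the following may help. It is a minimal sketch using the standard-library `importlib.metadata`. The distribution names (in particular `torch` for PyTorch) are assumptions, the `+cu128` local build tag is ignored when comparing, and `check_versions` is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions from the list above, keyed by assumed PyPI distribution name.
EXPECTED = {
    "transformers": "5.0.0",
    "torch": "2.10.0",      # listed as 2.10.0+cu128; build tag dropped
    "datasets": "4.0.0",
    "tokenizers": "0.22.2",
}


def base_version(v):
    """Strip a local build tag such as '+cu128' from a version string."""
    return v.split("+", 1)[0]


def check_versions(expected=EXPECTED):
    """Map each package to (installed version or None, matches expected?)."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = base_version(version(name))
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (installed, installed == wanted)
    return report


for name, (installed, ok) in check_versions().items():
    status = "ok" if ok else "differs"
    print(f"{name}: installed={installed} expected={EXPECTED[name]} ({status})")
```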