Basque LLM Evaluation
A public benchmark dashboard for comparing the performance of locally run LLMs on Basque-language tasks.
Table 1. Main comparative results
# | Model | Quantization | Overall | Evals (grouped) | N
Accuracy is reported as mean ± standard deviation across random seeds.
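As a rough illustration of the "mean ± std across random seeds" figure, the sketch below aggregates one accuracy value per seed into a single summary string. The function names, the example values, and the choice of the sample standard deviation (n - 1 denominator) are assumptions made for illustration; the dashboard does not state which estimator it uses.

```python
from statistics import mean, stdev


def summarize_accuracy(per_seed_accuracy):
    """Return (mean, std) for a list of per-seed accuracies in [0, 1].

    Assumption: sample standard deviation (n - 1 denominator).
    With a single seed there is no spread to estimate, so std is 0.0.
    """
    m = mean(per_seed_accuracy)
    s = stdev(per_seed_accuracy) if len(per_seed_accuracy) > 1 else 0.0
    return m, s


def format_accuracy(per_seed_accuracy, digits=1):
    """Format per-seed accuracies as a 'mean ± std' percentage string."""
    m, s = summarize_accuracy(per_seed_accuracy)
    return f"{m * 100:.{digits}f} ± {s * 100:.{digits}f}"


# Hypothetical accuracies from three seeds of one model on one eval.
print(format_accuracy([0.50, 0.52, 0.54]))
```

With these hypothetical inputs the summary reads 52.0 ± 2.0. If the population standard deviation were intended instead, `statistics.pstdev` would be the drop-in replacement.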
Figures
Figure 1. Overall accuracy by model
Figure 2. Accuracy profile by eval
Figure 3. Overall accuracy by release date
Evaluation protocol
Family | Benchmark | What is measured | Metric | Label space