Commercial models continue to achieve the strongest performance, but the gap with open-weight models is steadily narrowing as architectures and training strategies improve.

4B achieves 26

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

BURMESE-SAN: Burmese NLP Benchmark for Evaluating Large Language Models

cs.CL · 2026-02-21 · unverdicted · novelty 7.0 · ref 3

BURMESE-SAN creates the first comprehensive Burmese NLP benchmark with seven subtasks and shows that architecture, representation, and instruction tuning matter more than model scale for performance.
