arXiv preprint arXiv:2404.00459
3 Pith papers cite this work. Polarity classification is still indexing.
Verdicts: UNVERDICTED 3
Citing papers
- ArgBench: Benchmarking LLMs on Computational Argumentation Tasks
  ArgBench unifies 33 existing datasets into a standardized benchmark for testing LLMs across 46 argumentation tasks and analyzes the impact of prompting techniques and model factors on performance.
- Efficient numeracy in language models through single-token number embeddings
  BitTokens represent numbers as single tokens via IEEE 754 binary format, allowing small language models to learn basic arithmetic algorithms nearly perfectly.
- A Triadic Suffix Tokenization Scheme for Numerical Reasoning
  Triadic Suffix Tokenization groups digits into triads with fixed magnitude suffixes to make order-of-magnitude relationships explicit at the token level for LLMs.