Benchmarking 25 LLMs on Raspberry Pi hardware shows Granite4 Tiny Hybrid (7B) balances 2.5 tokens/s, 0.90 tokens/J, and 54.6% MMLU while teaching effectiveness does not require high general knowledge scores.
GLUE: A multi-task benchmark and analysis platform for natural language understanding,
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
verdicts
UNVERDICTED 2representative citing papers
FDA-Opt unifies and improves upon FedOpt and FDA for communication-efficient federated fine-tuning of language models on NLP tasks, outperforming optimized FedOpt baselines.
citing papers explorer
-
Benchmarking Local Language Models for Social Robots using Edge Devices
Benchmarking 25 LLMs on Raspberry Pi hardware shows Granite4 Tiny Hybrid (7B) balances 2.5 tokens/s, 0.90 tokens/J, and 54.6% MMLU while teaching effectiveness does not require high general knowledge scores.
-
Communication-Efficient Federated Fine-Tuning
FDA-Opt unifies and improves upon FedOpt and FDA for communication-efficient federated fine-tuning of language models on NLP tasks, outperforming optimized FedOpt baselines.