Sharpness-aware black-box optimization. arXiv preprint arXiv:2410.12457

Ye, F · arXiv 2410.12457

1 Pith paper cites this work. Polarity classification is still indexing.

1 Pith paper citing it

representative citing papers

cs.LG · 2026-02-01 · unverdicted · novelty 6.0

ESSAM matches PPO and GRPO accuracy (~78%) on GSM8K math tasks but uses 10-18x less GPU memory and shows stronger generalization across datasets.


ESSAM: A Novel Competitive Evolution Strategies Approach to Reinforcement Learning for Memory Efficient LLMs Fine-Tuning · cs.LG · 2026-02-01 · unverdicted · none · ref 9