Proceedings of the 32nd International Conference on Machine Learning

· 2015

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Vanishing L2 regularization for the softmax Multi Armed Bandit

cs.LG · 2026-05-05 · unverdicted · novelty 6.0

Vanishing L2 regularization yields provable convergence for softmax MAB policies and improves empirical performance.

Training Language Models to Self-Correct via Reinforcement Learning

cs.LG · 2024-09-19 · unverdicted · novelty 6.0

SCoRe uses multi-turn online RL with regularization on self-generated traces to improve LLM self-correction, achieving 15.6% and 9.1% gains on MATH and HumanEval for Gemini models.

Mathematical Reasoning in Large Language Models: Benchmarks, Architectures, Evaluation, and Open Challenges

cs.CL · 2026-05-19 · unverdicted · novelty 3.0

A literature survey synthesizing benchmarks, architectures, training strategies, and evaluation methods for mathematical reasoning in LLMs, based on roughly 120 papers.

citing papers explorer

Showing 3 of 3 citing papers.

Vanishing L2 regularization for the softmax Multi Armed Bandit
cs.LG · 2026-05-05 · unverdicted · none · ref 32
Vanishing L2 regularization yields provable convergence for softmax MAB policies and improves empirical performance.

Training Language Models to Self-Correct via Reinforcement Learning
cs.LG · 2024-09-19 · unverdicted · none · ref 280
SCoRe uses multi-turn online RL with regularization on self-generated traces to improve LLM self-correction, achieving 15.6% and 9.1% gains on MATH and HumanEval for Gemini models.

Mathematical Reasoning in Large Language Models: Benchmarks, Architectures, Evaluation, and Open Challenges
cs.CL · 2026-05-19 · unverdicted · none · ref 87
A literature survey synthesizing benchmarks, architectures, training strategies, and evaluation methods for mathematical reasoning in LLMs, based on roughly 120 papers.
