LOSCAR-SGD combines local updates, sparse model averaging, and communication-computation overlap with a delay-corrected merge rule, providing convergence rates for smooth non-convex objectives under worker heterogeneity.
hub
Journal of Machine Learning Research , volume=
20 Pith papers cite this work. Polarity classification is still indexing.
hub tools
citation-role summary
citation-polarity summary
roles
background 2polarities
background 2representative citing papers
Ringmaster LMO extends delay-thresholding from ASGD to LMO-based momentum updates, providing convergence guarantees under (L0, L1)-smoothness and time-complexity bounds that recover optimal rates in the Euclidean case.
DRATS derives a minimax objective from a feasibility formulation of MTRL to adaptively sample tasks with the largest return gaps, leading to better worst-task performance on MetaWorld benchmarks.
LLMs copy biased analyst ratings in investment decisions but a new detection method encourages independent reasoning and can improve stock return predictions beyond human levels.
NoisyCausal benchmark tests LLMs on causal reasoning with structured noise, and a modular LLM-plus-causal-graph framework outperforms baselines while generalizing to Cladder.
o1-like models overthink easy tasks; self-training reduces compute use without accuracy loss on GSM8K, MATH500, GPQA, and AIME.
OPT-BENCH and OPT-Agent evaluate LLM self-optimization in large search spaces, showing stronger models improve via feedback but stay constrained by base capacity and below human performance.
XPERT extracts and reuses cross-domain expert knowledge from pre-trained MoE LLMs via inference analysis and tensor decomposition to improve performance and convergence in downstream language model training.
CAGE uses common-agency games and an EPEC algorithm to compute equilibrium policies that balance multiple conflicting objectives for test-time LLM alignment.
Response times modeled as drift-diffusion processes enable consistent estimation of population-average preferences from heterogeneous anonymous binary choices.
A queueing model derives stability conditions for LLM inference services under combined compute and KV cache memory limits, with experimental validation showing typical deviations under 10%.
AGoQ delivers up to 52% lower memory use and 1.34x faster training for 8B-32B LLaMA models by using near-4-bit adaptive activations and 8-bit gradients while preserving pretraining convergence and downstream accuracy.
LLMs prompted with few-shot examples and rationales generate better reasoned distractors for MCQs than fine-tuned contrastive models across six benchmarks.
Empirical analysis shows scaling inference compute via strategies like tree search can be more efficient than scaling model parameters, with 7B models plus novel search outperforming 34B models.
DPOP is a new loss function that prevents DPO from lowering preferred response likelihoods and outperforms standard DPO on diverse datasets, MT-Bench, and enables Smaug-72B to exceed 80% on the Open LLM Leaderboard.
Rennala MVR improves time complexity over Rennala SGD for smooth nonconvex stochastic optimization in heterogeneous parallel systems under a mean-squared smoothness assumption.
A reference-free proxy scoring framework combined with GIRB calibration produces better-aligned evaluation metrics for summarization and outperforms baselines across seven datasets.
SmolLM2 is a 1.7B-parameter language model that outperforms Qwen2.5-1.5B and Llama3.2-1B after overtraining on 11 trillion tokens using custom FineMath, Stack-Edu, and SmolTalk datasets in a multi-stage pipeline.
UnAC improves LMM performance on visual reasoning benchmarks by combining adaptive visual prompting, image abstraction, and gradual self-checking.
LLM serving requires mathematical optimization and algorithms with provable guarantees rather than generic heuristics that fail unpredictably on LLM workloads.
citing papers explorer
-
LOSCAR-SGD: Local SGD with Communication-Computation Overlap and Delay-Corrected Sparse Model Averaging
LOSCAR-SGD combines local updates, sparse model averaging, and communication-computation overlap with a delay-corrected merge rule, providing convergence rates for smooth non-convex objectives under worker heterogeneity.
-
Ringmaster LMO: Asynchronous Linear Minimization Oracle Momentum Method
Ringmaster LMO extends delay-thresholding from ASGD to LMO-based momentum updates, providing convergence guarantees under (L0, L1)-smoothness and time-complexity bounds that recover optimal rates in the Euclidean case.
-
Distributionally Robust Multi-Task Reinforcement Learning via Adaptive Task Sampling
DRATS derives a minimax objective from a feasibility formulation of MTRL to adaptively sample tasks with the largest return gaps, leading to better worst-task performance on MetaWorld benchmarks.
-
Fin-Bias: Comprehensive Evaluation for LLM Decision-Making under human bias in Finance Domain
LLMs copy biased analyst ratings in investment decisions but a new detection method encourages independent reasoning and can improve stock return predictions beyond human levels.
-
NoisyCausal: A Benchmark for Evaluating Causal Reasoning Under Structured Noise
NoisyCausal benchmark tests LLMs on causal reasoning with structured noise, and a modular LLM-plus-causal-graph framework outperforms baselines while generalizing to Cladder.
-
Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs
o1-like models overthink easy tasks; self-training reduces compute use without accuracy loss on GSM8K, MATH500, GPQA, and AIME.
-
OPT-BENCH: Evaluating the Iterative Self-Optimization of LLM Agents in Large-Scale Search Spaces
OPT-BENCH and OPT-Agent evaluate LLM self-optimization in large search spaces, showing stronger models improve via feedback but stay constrained by base capacity and below human performance.
-
XPERT: Expert Knowledge Transfer for Effective Training of Language Models
XPERT extracts and reuses cross-domain expert knowledge from pre-trained MoE LLMs via inference analysis and tensor decomposition to improve performance and convergence in downstream language model training.
-
Common-agency Games for Multi-Objective Test-Time Alignment
CAGE uses common-agency games and an EPEC algorithm to compute equilibrium policies that balance multiple conflicting objectives for test-time LLM alignment.
-
Response Time Enhances Alignment with Heterogeneous Preferences
Response times modeled as drift-diffusion processes enable consistent estimation of population-average preferences from heterogeneous anonymous binary choices.
-
A Queueing-Theoretic Framework for Stability Analysis of LLM Inference with KV Cache Memory Constraints
A queueing model derives stability conditions for LLM inference services under combined compute and KV cache memory limits, with experimental validation showing typical deviations under 10%.
-
AGoQ: Activation and Gradient Quantization for Memory-Efficient Distributed Training of LLMs
AGoQ delivers up to 52% lower memory use and 1.34x faster training for 8B-32B LLaMA models by using near-4-bit adaptive activations and 8-bit gradients while preserving pretraining convergence and downstream accuracy.
-
Beyond Fine-Tuning: In-Context Learning and Chain-of-Thought for Reasoned Distractor Generation
LLMs prompted with few-shot examples and rationales generate better reasoned distractors for MCQs than fine-tuned contrastive models across six benchmarks.
-
Inference Scaling Laws: An Empirical Analysis of Compute-Optimal Inference for Problem-Solving with Language Models
Empirical analysis shows scaling inference compute via strategies like tree search can be more efficient than scaling model parameters, with 7B models plus novel search outperforming 34B models.
-
Smaug: Fixing Failure Modes of Preference Optimisation with DPO-Positive
DPOP is a new loss function that prevents DPO from lowering preferred response likelihoods and outperforms standard DPO on diverse datasets, MT-Bench, and enables Smaug-72B to exceed 80% on the Open LLM Leaderboard.
-
Rennala MVR: Improved Time Complexity for Parallel Stochastic Optimization via Momentum-Based Variance Reduction
Rennala MVR improves time complexity over Rennala SGD for smooth nonconvex stochastic optimization in heterogeneous parallel systems under a mean-squared smoothness assumption.
-
Calibrating Model-Based Evaluation Metrics for Summarization
A reference-free proxy scoring framework combined with GIRB calibration produces better-aligned evaluation metrics for summarization and outperforms baselines across seven datasets.
-
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model
SmolLM2 is a 1.7B-parameter language model that outperforms Qwen2.5-1.5B and Llama3.2-1B after overtraining on 11 trillion tokens using custom FineMath, Stack-Edu, and SmolTalk datasets in a multi-stage pipeline.
-
UnAC: Adaptive Visual Prompting with Abstraction and Stepwise Checking for Complex Multimodal Reasoning
UnAC improves LMM performance on visual reasoning benchmarks by combining adaptive visual prompting, image abstraction, and gradual self-checking.
-
Position: LLM Serving Needs Mathematical Optimization and Algorithmic Foundations, Not Just Heuristics
LLM serving requires mathematical optimization and algorithms with provable guarantees rather than generic heuristics that fail unpredictably on LLM workloads.