A diameter criterion tied to a potential function certifies convergence of difference inclusions, enabling discrete proofs for first-order optimization methods with diminishing steps.
4 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (4)
Verdicts: UNVERDICTED (4)
citing papers explorer
- Convergence of difference inclusions via a diameter criterion
  A diameter criterion tied to a potential function certifies convergence of difference inclusions, enabling discrete proofs for first-order optimization methods with diminishing steps.
- An Efficient Multilevel Preconditioned Nonlinear Conjugate Gradient Method for Incremental Potential Contact
  MAS-PNCG accelerates IPC by incrementally updating multilevel MAS preconditioners via Sparse-Input Woodbury, adding Hessian-aware 2D subspace minimization and per-subdomain CCD, achieving up to 5.66x speedup over Newton-PCG baselines.
- AGoQ: Activation and Gradient Quantization for Memory-Efficient Distributed Training of LLMs
  AGoQ delivers up to 52% lower memory use and 1.34x faster training for 8B-32B LLaMA models by using near-4-bit adaptive activations and 8-bit gradients while preserving pretraining convergence and downstream accuracy.
- Neural Co-state Policies: Structuring Hidden States in Recurrent Reinforcement Learning
  Recurrent RL policies can have their hidden states aligned with PMP co-states through a derived loss, yielding robust performance on partially observable control tasks.