LOSCAR-SGD combines local updates, sparse model averaging, and communication-computation overlap with a delay-corrected merge rule, providing convergence rates for smooth non-convex objectives under worker heterogeneity.
Title resolution pending
16 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
roles
background 1polarities
background 1representative citing papers
Ringmaster LMO extends delay-thresholding from ASGD to LMO-based momentum updates, providing convergence guarantees under (L0, L1)-smoothness and time-complexity bounds that recover optimal rates in the Euclidean case.
Iterative self-rewarding via LLM-as-Judge in DPO training on Llama 2 70B improves instruction following and self-evaluation, outperforming GPT-4 on AlpacaEval 2.0.
STELLAR trains up to 500M-parameter multi-modal models on 50M driving scenes and reports empirical scaling trends plus new state-of-the-art results on the Waymo Open Dataset.
GoLongRL releases a 23K-sample open long-context RL dataset spanning 9 tasks and introduces TMN-Reweight to improve multitask optimization, achieving performance comparable to much larger models under GRPO.
DisAAD trains a 1%-sized proxy model via adversarial distillation to quantify uncertainty in black-box LLMs by aligning with their output distributions.
A queueing model derives stability conditions for LLM inference services under combined compute and KV cache memory limits, with experimental validation showing typical deviations under 10%.
S2H-DPO generates hierarchical prompt-driven preference pairs to improve multi-image reasoning in VLMs while keeping single-image performance intact.
A multi-agent AI system generates novel biomedical hypotheses that show promising experimental validation in drug repurposing for leukemia, new targets for liver fibrosis, and a bacterial gene transfer mechanism.
Med-Gemini sets new records on 10 of 14 medical benchmarks including 91.1% on MedQA-USMLE, beats GPT-4V by 44.5% on multimodal tasks, and surpasses humans on medical text summarization.
Rennala MVR improves time complexity over Rennala SGD for smooth nonconvex stochastic optimization in heterogeneous parallel systems under a mean-squared smoothness assumption.
TabGRAA applies group-relative advantage alignment in an iterative reward-guided post-training loop to improve tabular language model generators on fidelity, utility, and privacy trade-offs across five benchmarks.
MAPLE uses meta-learning with prototypical networks to learn transferable representations and achieves state-of-the-art cross-prompt essay scoring on ELLIPSE, LAILA, and parts of ASAP datasets.
The work introduces WaLeF/FIDLAr for flood forecasting, CoDiCast for probabilistic weather, and Hypercube-RAG for explainable environmental QA, claiming superior accuracy, efficiency, and interpretability over baselines.
LLM serving requires mathematical optimization and algorithms with provable guarantees rather than generic heuristics that fail unpredictably on LLM workloads.
Combining English and target-language web retrieval boosts medical QA for low-resource languages to match high-resource performance, while English web data benefits high-resource languages most and specialized sources like PubMed lack multilingual coverage.
citing papers explorer
-
Towards an AI co-scientist
A multi-agent AI system generates novel biomedical hypotheses that show promising experimental validation in drug repurposing for leukemia, new targets for liver fibrosis, and a bacterial gene transfer mechanism.
-
Capabilities of Gemini Models in Medicine
Med-Gemini sets new records on 10 of 14 medical benchmarks including 91.1% on MedQA-USMLE, beats GPT-4V by 44.5% on multimodal tasks, and surpasses humans on medical text summarization.