Gonzalez and Ion Stoica , booktitle=

Lianmin Zheng, Wei-Lin Chiang, Ying Sheng, Siyuan Zhuang, Zhanghao Wu, Yonghao Zhuang

5 Pith papers cite this work. Polarity classification is still indexing.


representative citing papers

Learning How and What to Memorize: Cognition-Inspired Two-Stage Optimization for Evolving Memory

cs.CL · 2026-05-01 · unverdicted · novelty 7.0

MemCoE learns memory organization guidelines via contrastive feedback and then trains a guideline-aligned RL policy for memory updates, yielding consistent gains on personalization benchmarks.

Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs

cs.CL · 2024-12-30 · unverdicted · novelty 7.0

o1-like models overthink easy tasks; self-training reduces compute use without accuracy loss on GSM8K, MATH500, GPQA, and AIME.

Power Distribution Bridges Sampling, Self-Reward RL, and Self-Distillation

cs.LG · 2026-05-06 · unverdicted · novelty 6.0

The power distribution is the target of power sampling, the closed-form solution to self-reward KL-regularized RL, and the basis for power self-distillation that matches sampling performance at lower cost.

DoRA: Weight-Decomposed Low-Rank Adaptation

cs.CL · 2024-02-14 · accept · novelty 6.0

DoRA improves LoRA by decomposing weights into magnitude and direction and updating only direction with low-rank matrices, closing much of the gap to full fine-tuning.

Multimodal Hidden Markov Models for Persistent Emotional State Tracking

cs.AI · 2026-05-13 · unverdicted · novelty 5.0

Sticky factorial HDP-HMMs applied to multimodal valence-arousal trajectories identify interpretable persistent emotional regimes in conversations, outperforming Gaussian HMM baselines in consistency metrics and enabling context-augmented LLM responses.

citing papers explorer

Showing 5 of 5 citing papers.

Learning How and What to Memorize: Cognition-Inspired Two-Stage Optimization for Evolving Memory
cs.CL · 2026-05-01 · unverdicted · none · ref 55
Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs
cs.CL · 2024-12-30 · unverdicted · none · ref 261
Power Distribution Bridges Sampling, Self-Reward RL, and Self-Distillation
cs.LG · 2026-05-06 · unverdicted · none · ref 94
DoRA: Weight-Decomposed Low-Rank Adaptation
cs.CL · 2024-02-14 · accept · none · ref 40
Multimodal Hidden Markov Models for Persistent Emotional State Tracking
cs.AI · 2026-05-13 · unverdicted · none · ref 17
