arXiv preprint arXiv:2509.21164

Mixture of thoughts: Learning to aggregate what experts think, not just what they say · 2025 · arXiv 2509.21164

4 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

See What I See, Know What I Think: Dense Latent Communication Across Heterogeneous Agents

cs.MA · 2026-06-11 · unverdicted · novelty 7.0

Heterogeneous agents achieve dense latent KV-cache communication via lightweight cross-model transformation and two-phase training, outperforming text at lower compute in context-aware settings and enabling context-unaware transfer.

HumorGen: Cognitive Synergy for Humor Generation in Large Language Models via Persona-Based Distillation

cs.CL · 2026-03-19 · conditional · novelty 6.0

Persona-based Mixture-of-Thought data curation lets a 7B student outperform larger models on humor generation, while DPO and O-GRPO add no gain over SFT.

When Do LLMs Reason? A Dynamical Systems View via Entropy Phase Transitions

cs.LG · 2026-05-20 · unverdicted · novelty 5.0

Early entropy dynamics during LLM decoding mark when explicit reasoning becomes beneficial, enabling the training-free EDRM router that selects strategies per instance and yields 41-55% token savings with accuracy gains across 15 benchmarks.

Beyond tokens: a unified framework for latent communication in LLM-based multi-agent systems

cs.CL · 2026-06-04

citing papers explorer

Showing 4 of 4 citing papers.

See What I See, Know What I Think: Dense Latent Communication Across Heterogeneous Agents · cs.MA · 2026-06-11 · unverdicted · none · ref 17
HumorGen: Cognitive Synergy for Humor Generation in Large Language Models via Persona-Based Distillation · cs.CL · 2026-03-19 · conditional · none · ref 9
When Do LLMs Reason? A Dynamical Systems View via Entropy Phase Transitions · cs.LG · 2026-05-20 · unverdicted · none · ref 34
Beyond tokens: a unified framework for latent communication in LLM-based multi-agent systems · cs.CL · 2026-06-04 · unreviewed · ref 9
