DeepEyesV2 uses a two-stage cold-start plus reinforcement learning pipeline to produce an agentic multimodal model that adaptively invokes tools and outperforms direct RL on real-world reasoning benchmarks.
arXiv preprint arXiv:2502.04567
2 Pith papers cite this work. Polarity classification is still indexing.
verdicts
UNVERDICTED 2
representative citing papers
HawkesLLM pairs a multivariate Hawkes process with language models to model temporal influence cascades in agentic text simulation and reports improved late-stage semantic alignment on a GDELT news case study under limited memory.
citing papers explorer
- DeepEyesV2: Toward Agentic Multimodal Model
- HawkesLLM: Semantic Uncertainty Propagation in Agentic Text Simulation