arXiv preprint arXiv:2411.08671

Group-in-group policy optimization for LLM agent training · 1994 · arXiv 2411.08671

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

A Multi-head-based architecture for effective morphological tagging in Russian with open dictionary

cs.CL · 2026-04-03 · unverdicted · novelty 6.0

A multi-head attention model for Russian morphological tagging supports open dictionaries via subtoken splitting and reports 98-99% accuracy on grammatical categories while running efficiently on consumer hardware.

Skill Reuse as Compression in Agentic RL

cs.LG · 2026-05-29 · unverdicted · novelty 5.0

ReuseRL augments agentic RL with an MDL-based compression penalty on skill reuse, proves a PAC-Bayes bound, and reports higher in- and out-of-distribution success on ALFWorld, TextWorld-Cooking, and Countdown-Stepwise versus GRPO and round-length baselines.

Understanding Secret Leakage Risks in Code LLMs: A Tokenization Perspective

cs.CR · 2026-04-20 · unverdicted · novelty 5.0

BPE tokenization creates gibberish bias in CLLMs, causing secrets with high character entropy but low token entropy to be preferentially memorized due to training data distribution shifts.

citing papers explorer

Showing 3 of 3 citing papers.

A Multi-head-based architecture for effective morphological tagging in Russian with open dictionary

cs.CL · 2026-04-03 · unverdicted · none · ref 13

A multi-head attention model for Russian morphological tagging supports open dictionaries via subtoken splitting and reports 98-99% accuracy on grammatical categories while running efficiently on consumer hardware.

Skill Reuse as Compression in Agentic RL

cs.LG · 2026-05-29 · unverdicted · none · ref 2

ReuseRL augments agentic RL with an MDL-based compression penalty on skill reuse, proves a PAC-Bayes bound, and reports higher in- and out-of-distribution success on ALFWorld, TextWorld-Cooking, and Countdown-Stepwise versus GRPO and round-length baselines.

Understanding Secret Leakage Risks in Code LLMs: A Tokenization Perspective

cs.CR · 2026-04-20 · unverdicted · none · ref 60

BPE tokenization creates gibberish bias in CLLMs, causing secrets with high character entropy but low token entropy to be preferentially memorized due to training data distribution shifts.
