A multi-head attention model for Russian morphological tagging supports open dictionaries via subtoken splitting and reports 98-99% accuracy on grammatical categories while running efficiently on consumer hardware.
arXiv preprint arXiv:2411.08671
3 Pith papers cite this work.
citing papers explorer
- A Multi-head-based architecture for effective morphological tagging in Russian with open dictionary
- Skill Reuse as Compression in Agentic RL
ReuseRL augments agentic RL with an MDL-based compression penalty on skill reuse, proves a PAC-Bayes bound, and reports higher in- and out-of-distribution success on ALFWorld, TextWorld-Cooking, and Countdown-Stepwise versus GRPO and round-length baselines.
- Understanding Secret Leakage Risks in Code LLMs: A Tokenization Perspective
BPE tokenization creates gibberish bias in CLLMs, causing secrets with high character entropy but low token entropy to be preferentially memorized due to training data distribution shifts.