7 Pith papers cite this work.
Years: 2026 (7)
citing papers explorer
- GoLongRL: Capability-Oriented Long Context Reinforcement Learning with Multitask Alignment
  GoLongRL releases a 23K-sample open long-context RL dataset spanning 9 tasks and introduces TMN-Reweight to improve multitask optimization, achieving performance comparable to much larger models under GRPO.
- Towards Self-Evolving Agentic Literature Retrieval
  PaSaMaster is a self-evolving agentic literature retrieval system that improves F1-score by 15.6X over keyword search and outperforms GPT-5.2 by 30% at 1% cost with zero source hallucination across 38 disciplines.
- Self-Pruned Key-Value Attention: Learning When to Write by Predicting Future Utility
  SP-KV trains a utility predictor jointly with the LLM to dynamically prune low-utility KV cache entries, achieving 3-10x memory reduction during generation with negligible performance loss.
- Autoformalizing Memory Specifications with Agents
  An agent system autoformalizes industry DRAM specifications into DRAMPyML for verification tasks like assertion generation, with DRAMBench dataset released for benchmarking.
- UserGPT Technical Report
  UserGPT introduces a generative LLM framework with a behavior simulation engine, semantization module, and DF-GRPO post-training that scores 0.7325 on tag prediction and 0.7528 on summary generation on HPR-Bench while compressing records by up to 97.9%.
- SciAtlas: A Large-Scale Knowledge Graph for Automated Scientific Research
  SciAtlas builds a large-scale multi-disciplinary academic knowledge graph and a neuro-symbolic retrieval system to support automated scientific research tasks such as literature review and idea positioning.
- Hy-MT2: A Family of Fast, Efficient and Powerful Multilingual Translation Models in the Wild