{"total":13,"items":[{"citing_arxiv_id":"2604.22452","ref_index":1,"ref_count":1,"confidence":0.35,"is_internal_anchor":false,"paper_title":"Superminds Test: Actively Evaluating Collective Intelligence of Agent Society via Probing Agents","primary_cat":"cs.AI","submitted_at":"2026-04-24T11:11:36+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Large-scale experiments on two million agents reveal that collective intelligence does not emerge from scale alone due to sparse and shallow interactions.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.09791","ref_index":1,"ref_count":1,"confidence":0.35,"is_internal_anchor":false,"paper_title":"Pioneer Agent: Continual Improvement of Small Language Models in Production","primary_cat":"cs.AI","submitted_at":"2026-04-10T18:13:09+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Pioneer Agent automates the full lifecycle of adapting and continually improving small language models via diagnosis-driven data synthesis and regression-constrained retraining, delivering gains of 1.6-83.8 points on benchmarks and large lifts in production-style tasks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.08706","ref_index":1,"ref_count":1,"confidence":0.35,"is_internal_anchor":false,"paper_title":"Efficient RL Training for LLMs with Experience Replay","primary_cat":"cs.LG","submitted_at":"2026-04-09T18:56:12+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Well-designed experience replay buffers reduce inference compute in LLM RL post-training while maintaining or improving performance and preserving policy 
entropy.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.03208","ref_index":1,"ref_count":1,"confidence":0.35,"is_internal_anchor":false,"paper_title":"Hierarchical Planning with Latent World Models","primary_cat":"cs.LG","submitted_at":"2026-04-03T17:32:36+00:00","verdict":null,"verdict_confidence":null,"novelty_score":null,"formal_verification":null,"one_line_summary":null,"context_count":1,"top_context_role":"other","top_context_polarity":"unclear","context_text":"actions between waypoint states into latent macro-actions. This reduces the dimensionality of the high-level search space and makes planning more tractable. By decoupling long-horizon reasoning from fine-grained control, HWM mitigates error accumulation and reduces overall planning complexity. [Figure 2 panels: High-Level Planning to Goal (arg min over l of L_high); Low-Level Planning to 1st Subgoal (arg min over a of L_low). Model key: P(1) low-level world model; P(2) high-level world model.] Figure 2: Hierarchical planning in latent space. A high-level planner optimizes macro actions using a long-horizon latent world model to reach the final goal embedding."},{"citing_arxiv_id":"2604.03021","ref_index":1,"ref_count":1,"confidence":0.35,"is_internal_anchor":false,"paper_title":"Temporal structure of the language hierarchy within small cortical patches","primary_cat":"q-bio.NC","submitted_at":"2026-04-03T13:13:22+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Small cortical patches multiplex phonetic, syllabic, and lexical representations during speech production via dynamic temporal coding.","context_count":1,"top_context_role":"background","top_context_polarity":"unclear","context_text":"Alex Graves, Abdel-rahman Mohamed, and Geoffrey Hinton. 
Speech recognition with deep recurrent neural networks. In 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, pages 6645-6649. IEEE, 2013. Laura Gwilliams, Jean-Remi King, Alec Marantz, and David Poeppel. Neural dynamics of phoneme sequences reveal position-invariant code for content and order. Nature Communications, 13(1):6606, 2022. Laura Gwilliams, Alec Marantz, David Poeppel, and Jean-Remi King. Hierarchical dynamic coding coordinates speech comprehension in the human brain. bioRxiv, 2025. Matthew Honnibal, Ines Montani, Sofie Van Landeghem, Adriane Boyd, et al. spaCy: Industrial-strength natural language processing in Python. 2020. Jennifer Hu, Hannah Small, Hope Kean, Atsushi Takahashi, Leo Zekelman, Daniel Kleinman, Elizabeth Ryan, Alfonso"},{"citing_arxiv_id":"2604.02688","ref_index":36,"ref_count":2,"confidence":0.35,"is_internal_anchor":false,"paper_title":"MatClaw: An Autonomous Code-First LLM Agent for End-to-End Materials Exploration","primary_cat":"cond-mat.mtrl-sci","submitted_at":"2026-04-03T03:32:15+00:00","verdict":"CONDITIONAL","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"MatClaw shows a code-first LLM agent autonomously generating and executing workflows for ML force field training, Curie temperature prediction, and parameter search on CuInP2S6, succeeding on code but requiring interventions for tacit domain knowledge.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.01348","ref_index":1,"ref_count":1,"confidence":0.35,"is_internal_anchor":false,"paper_title":"Procedural Knowledge at Scale Improves Reasoning","primary_cat":"cs.CL","submitted_at":"2026-04-01T20:01:47+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Reasoning Memory decomposes reasoning trajectories into 32 million subquestion-subroutine pairs and retrieves them 
via in-thought prompts to improve language model performance on math, science, and coding benchmarks by up to 19.2%.","context_count":1,"top_context_role":"other","top_context_polarity":"unclear","context_text":"Sample Filtering. We perform additional filtering over the m samples collected from top-k retrievals. For π_{j,l}, the sample l of the j-th retrieved hint, we calculate the raw score r_{j,l} as |π_{j,l}|, the length of the trajectory in terms of tokens. Let r_max and r_min be the maximum and minimum scores across all samples for the same question; we normalize the score to get a quality score for each sample as r̂_{j,l} = (r_max − r_{j,l}) / (r_max − r_min) ∈ [0, 1]. We [Footnote 6: Accessed at https://huggingface.co/reasonir/ReasonIR-8B] Original Question: A museum prints ticket codes in an unknown base b ≥ 6. One ticket reads 2A5_b, where A is a single digit. When interpreted as a base-10 integer, the code must (1) be the product of two distinct primes and (2) satisfy the checksum N ≡ 3 (mod 11). Determine all pairs (b, A) that produce a valid ticket."},{"citing_arxiv_id":"2509.21267","ref_index":1,"ref_count":1,"confidence":0.35,"is_internal_anchor":false,"paper_title":"Task-Dependent Evaluation of LLM Output Homogenization: A Taxonomy-Guided Framework","primary_cat":"cs.CL","submitted_at":"2025-09-25T14:58:07+00:00","verdict":"CONDITIONAL","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Proposes a task taxonomy for functional diversity in LLM outputs, validates it via user study, introduces targeted sampling to boost diversity only where needed, and presents evidence that the diversity-quality tradeoff may be an artifact of task-agnostic measurement.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2509.18095","ref_index":1,"ref_count":1,"confidence":0.35,"is_internal_anchor":false,"paper_title":"MetaEmbed: Scaling Multimodal Retrieval at Test-Time with Flexible Late 
Interaction","primary_cat":"cs.IR","submitted_at":"2025-09-22T17:59:42+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"MetaEmbed trains fixed learnable Meta Tokens to produce granularity-organized multi-vector embeddings that support test-time scaling in multimodal retrieval.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2509.02376","ref_index":76,"ref_count":1,"confidence":0.35,"is_internal_anchor":false,"paper_title":"Resampling-based multi-resolution false discovery exceedance control","primary_cat":"stat.ME","submitted_at":"2025-09-02T14:43:24+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"A resampling-based extension of maxT that delivers simultaneous FDX control over all thresholds and enables data-dependent confidence envelopes for the first time.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2507.19247","ref_index":37,"ref_count":1,"confidence":0.35,"is_internal_anchor":false,"paper_title":"A Markov Categorical Framework for Language Modeling","primary_cat":"cs.LG","submitted_at":"2025-07-25T13:14:03+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"A Markov category framework for language models provides an information-theoretic rationale for speculative decoding and shows that a quadratic surrogate to negative log-likelihood induces generalized CCA alignment in linear-softmax heads after normalization.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2406.12500","ref_index":3,"ref_count":1,"confidence":0.35,"is_internal_anchor":false,"paper_title":"$C^1$-robust homoclinic 
tangencies","primary_cat":"math.DS","submitted_at":"2024-06-18T11:02:38+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Blenders constructed via C^r-small perturbations of heterodimensional cycles generate C^1-robust tangencies, and homoclinic tangency unfolding produces uncountably many robust examples under the stated conditions, answering Bonatti-Díaz.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2303.05307","ref_index":32,"ref_count":1,"confidence":0.35,"is_internal_anchor":false,"paper_title":"Learning Strategic Value and Cooperation in Multi-Player Stochastic Games through Side Payments","primary_cat":"cs.GT","submitted_at":"2023-03-09T14:57:15+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Introduces HS-S (aggregating dynamic threat powers) and Coco-S (fixed points of statewise HS Bellman operator) for stochastic games, proves they coincide for two players but disagree for three, shows uniqueness via extended axioms and topological degree theory, and gives sampling estimators.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null}],"limit":50,"offset":0}