pith. sign in

hub

End-to-end test-time training for long context

14 Pith papers cite this work. Polarity classification is still indexing.

14 Pith papers citing it

hub tools

citation-role summary

background 3

citation-polarity summary

years

2026 14

roles

background 3

polarities

background 3

clear filters

representative citing papers

MemDLM: Memory-Enhanced DLM Training

cs.CL · 2026-03-23 · unverdicted · novelty 7.0

MemDLM embeds a simulated denoising trajectory into DLM training via bi-level optimization, creating a parametric memory that improves convergence and long-context performance even when the memory is dropped at test time.

Learning to Discover at Test Time

cs.LG · 2026-01-22 · unverdicted · novelty 7.0

TTT-Discover applies test-time RL to set new state-of-the-art results on math inequalities, GPU kernels, algorithm contests, and single-cell denoising using an open model and public code.

Scaling Self-Evolving Agents via Parametric Memory

cs.AI · 2026-06-03 · unverdicted · novelty 6.0

TMEM lets LLM agents evolve their policy mid-episode by absorbing distilled supervision into online LoRA updates, outperforming summary and retrieval baselines on several long-context benchmarks.

FocuSFT: Bilevel Optimization for Dilution-Aware Long-Context Fine-Tuning

cs.CL · 2026-05-11 · unverdicted · novelty 6.0

FocuSFT uses an inner optimization loop to adapt fast-weight parameters into a parametric memory that sharpens attention on relevant content, then conditions outer-loop supervised fine-tuning on this representation, yielding gains on long-context benchmarks.

Fast Spatial Memory with Elastic Test-Time Training

cs.CV · 2026-04-08 · unverdicted · novelty 6.0

Elastic Test-Time Training stabilizes test-time updates via an elastic prior and moving-average anchor, enabling Fast Spatial Memory for scalable long-sequence 4D reconstruction with reduced memory use and fewer shortcuts.

PACEvolve++: Improving Test-time Learning for Evolutionary Search Agents

cs.LG · 2026-05-07 · unverdicted · novelty 5.0

PACEvolve++ uses a phase-adaptive reinforcement learning advisor to decouple hypothesis selection from execution in LLM-driven evolutionary search, delivering faster convergence than prior frameworks on load balancing, recommendation, and protein tasks.

Bilevel learning

math.OC · 2026-05-02 · unverdicted · novelty 2.0

Bilevel learning methods rely on implicit differentiation but are restricted by assumptions of unique lower-level solutions and struggle with constraints, and connections to broader bilevel optimization literature may enable more scalable general-purpose algorithms.

citing papers explorer

Showing 13 of 13 citing papers after filters.

  • Continual Learning Bench: Evaluating Frontier AI Systems in Real-World Stateful Environments cs.AI · 2026-06-04 · unverdicted · none · ref 31

    CL-Bench is the first expert-validated benchmark for continual learning in frontier LLMs across six real-world domains, showing limited gains and that naive in-context learning outperforms dedicated memory systems.

  • Do Language Models Need Sleep? Offline Recurrence for Improved Online Inference cs.CL · 2026-05-25 · unverdicted · none · ref 55

    A sleep mechanism with N offline recurrent passes consolidates context into fast weights, improving performance on reasoning tasks where standard transformers fail.

  • MemDLM: Memory-Enhanced DLM Training cs.CL · 2026-03-23 · unverdicted · none · ref 57

    MemDLM embeds a simulated denoising trajectory into DLM training via bi-level optimization, creating a parametric memory that improves convergence and long-context performance even when the memory is dropped at test time.

  • Learning to Discover at Test Time cs.LG · 2026-01-22 · unverdicted · none · ref 71

    TTT-Discover applies test-time RL to set new state-of-the-art results on math inequalities, GPU kernels, algorithm contests, and single-cell denoising using an open model and public code.

  • T3R: Deeper Test-Time Adaptation for Graph Neural Networks via Gradient Rotation cs.LG · 2026-06-29 · unverdicted · none · ref 15

    T3R applies multiple Rotograd matrices and a rotation technique to create surrogate gradients, enabling deeper test-time adaptation in GNNs and yielding 0.172 MAE reduction plus 9.37% relative gains on OGB benchmarks.

  • Scaling Self-Evolving Agents via Parametric Memory cs.AI · 2026-06-03 · unverdicted · none · ref 14

    TMEM lets LLM agents evolve their policy mid-episode by absorbing distilled supervision into online LoRA updates, outperforming summary and retrieval baselines on several long-context benchmarks.

  • AgentOdyssey: Open-Ended Long-Horizon Text Game Generation for Test-Time Continual Learning Agents cs.CL · 2026-05-29 · unverdicted · none · ref 27

    Introduces AgentOdyssey, a procedural generator of open-ended long-horizon text games, to evaluate test-time continual learning agents and diagnose limits in exploration, memory, and planning.

  • In-context learning to predict critical transitions in dynamical systems cs.LG · 2026-05-12 · unverdicted · none · ref 50

    TipPFN uses prior-data fitted networks and in-context learning on synthetic bifurcation data to detect proximity to critical transitions in unseen dynamical systems and real observations.

  • FocuSFT: Bilevel Optimization for Dilution-Aware Long-Context Fine-Tuning cs.CL · 2026-05-11 · unverdicted · none · ref 32

    FocuSFT uses an inner optimization loop to adapt fast-weight parameters into a parametric memory that sharpens attention on relevant content, then conditions outer-loop supervised fine-tuning on this representation, yielding gains on long-context benchmarks.

  • Fast Spatial Memory with Elastic Test-Time Training cs.CV · 2026-04-08 · unverdicted · none · ref 45

    Elastic Test-Time Training stabilizes test-time updates via an elastic prior and moving-average anchor, enabling Fast Spatial Memory for scalable long-sequence 4D reconstruction with reduced memory use and fewer shortcuts.

  • IndexMem: Learned KV-Cache Eviction with Latent Memory for Long-Context LLM Inference cs.CL · 2026-05-25 · unverdicted · none · ref 23

    IndexMem proposes a learned KV importance predictor paired with a latent memory module to enable bounded KV cache size for long-context inference, reporting gains on RULER, Needle-in-a-Haystack, and LongBench across multiple LLMs.

  • PACEvolve++: Improving Test-time Learning for Evolutionary Search Agents cs.LG · 2026-05-07 · unverdicted · none · ref 38

    PACEvolve++ uses a phase-adaptive reinforcement learning advisor to decouple hypothesis selection from execution in LLM-driven evolutionary search, delivering faster convergence than prior frameworks on load balancing, recommendation, and protein tasks.

  • Bilevel learning math.OC · 2026-05-02 · unverdicted · none · ref 28

    Bilevel learning methods rely on implicit differentiation but are restricted by assumptions of unique lower-level solutions and struggle with constraints, and connections to broader bilevel optimization literature may enable more scalable general-purpose algorithms.