Online model distillation for efficient video inference

Ravi Teja Mullapudi, Steven Chen, Keyi Zhang, Deva Ramanan, Kayvon Fatahalian · 2018 · arXiv 1812.02699

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

read on arXiv browse 3 citing papers

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Learning to (Learn at Test Time): RNNs with Expressive Hidden States

cs.LG · 2024-07-05 · conditional · novelty 8.0

TTT layers treat the hidden state as a trainable model updated at test time, allowing linear-complexity sequence models to scale perplexity reduction with context length unlike Mamba.

Learning to Discover at Test Time

cs.LG · 2026-01-22 · unverdicted · novelty 7.0

TTT-Discover applies test-time RL to set new state-of-the-art results on math inequalities, GPU kernels, algorithm contests, and single-cell denoising using an open model and public code.

Forget, Anticipate and Adapt: Test Time Training for Long Videos

cs.CV · 2026-06-25 · unverdicted · novelty 6.0

FFN performs TTT on multi-hour videos by restricting updates to three frames and using a surprise metric for adaptive window sizing, plus a new EpicTours dataset.

citing papers explorer

Showing 3 of 3 citing papers.

Learning to (Learn at Test Time): RNNs with Expressive Hidden States cs.LG · 2024-07-05 · conditional · none · ref 56
TTT layers treat the hidden state as a trainable model updated at test time, allowing linear-complexity sequence models to scale perplexity reduction with context length unlike Mamba.
Learning to Discover at Test Time cs.LG · 2026-01-22 · unverdicted · none · ref 49
TTT-Discover applies test-time RL to set new state-of-the-art results on math inequalities, GPU kernels, algorithm contests, and single-cell denoising using an open model and public code.
Forget, Anticipate and Adapt: Test Time Training for Long Videos cs.CV · 2026-06-25 · unverdicted · none · ref 3
FFN performs TTT on multi-hour videos by restricting updates to three frames and using a surprise metric for adaptive window sizing, plus a new EpicTours dataset.

Online model distillation for efficient video inference

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer