Title resolution pending

2019

3 Pith papers cite this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver will retry the DOI and OpenAlex lookups on its next pass.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Effective Context in Transformers: An Analysis of Fragmentation and Tokenization

cs.LG · 2026-05-13 · unverdicted · novelty 8.0

Fragmentation strictly raises optimal finite-context log-loss on Markov sources while tokenization can make a short token window equivalent to a longer source window under reliability and compression conditions.

Memory Inception: Latent-Space KV Cache Manipulation for Steering LLMs

cs.LG · 2026-05-07 · unverdicted · novelty 6.0 · 2 refs

Memory Inception is a training-free method that injects latent KV banks at chosen layers to steer LLMs, achieving superior control-drift balance and up to 118x storage reduction on personality and structured-reasoning tasks.

A Study on Hidden Layer Distillation for Large Language Model Pre-Training

cs.CL · 2026-05-12 · unverdicted · novelty 5.0

Hidden layer distillation yields systematic perplexity gains over logit KD in LLM pre-training but does not consistently improve downstream performance.

citing papers explorer

Showing 3 of 3 citing papers.

Effective Context in Transformers: An Analysis of Fragmentation and Tokenization

cs.LG · 2026-05-13 · unverdicted · none · ref 5

Memory Inception: Latent-Space KV Cache Manipulation for Steering LLMs

cs.LG · 2026-05-07 · unverdicted · none · ref 3 · 2 links

A Study on Hidden Layer Distillation for Large Language Model Pre-Training

cs.CL · 2026-05-12 · unverdicted · none · ref 38
