Advancing transformer architecture in long-context large language models: A comprehensive survey

Advancing transformer architecture in long-context large language models: A comprehensive survey · 2024 · arXiv 2311.12351

3 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

PARTREP: Learning What to Repeat for Decoder-only LLMs

cs.CL · 2026-07-02 · conditional · novelty 6.0

PartRep selects high-NLL tokens via a lightweight early-exit gate for partial prompt repetition, retaining most full-repetition gains at 59.4% KV cache and 79% prefill FLOPs on eight benchmarks.

CHEM: Estimating and Understanding Hallucinations in Deep Learning for Image Processing

cs.CV · 2025-12-10 · unverdicted · novelty 6.0

The paper defines the Conformal Hallucination Estimation Metric (CHEM) that localizes hallucination-prone regions in image reconstruction models via multiscale representations and distribution-free conformal regression.

A Survey on the Memory Mechanism of Large Language Model based Agents

cs.AI · 2024-04-21 · accept · novelty 3.0

A systematic review of memory designs, evaluation methods, applications, limitations, and future directions for LLM-based agents.

citing papers explorer

Showing 1 of 1 citing paper after filters.

PARTREP: Learning What to Repeat for Decoder-only LLMs cs.CL · 2026-07-02 · conditional · none · ref 4
