Uncovering hidden geometry in transformers via disentangling position and context

Jiajun Song, Yiqiao Zhong · 2023 · arXiv 2310.04861

5 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background: 1 · method: 1

citation-polarity summary

unclear: 1 · use method: 1

representative citing papers

Manifold Steering Reveals the Shared Geometry of Neural Network Representation and Behavior

cs.LG · 2026-05-06 · unverdicted · novelty 7.0

Manifold steering along activation geometry induces behavioral trajectories matching the natural manifold of outputs, while linear steering produces off-manifold unnatural behaviors.

Task Vector Geometry Underlies Dual Modes of Task Inference in Transformers

cs.LG · 2026-05-05 · unverdicted · novelty 7.0

In a controlled synthetic setting, transformers implement in-distribution task inference via convex combinations of task vectors and out-of-distribution inference via nearly orthogonal extrapolative representations.

Arithmetic in the Wild: Llama uses Base-10 Addition to Reason About Cyclic Concepts

cs.AI · 2026-05-01 · unverdicted · novelty 7.0

Llama-3.1-8B computes sums for cyclic concepts using base-10 addition via task-agnostic Fourier features with periods 2, 5, and 10 rather than modular arithmetic in the concept period.

RDP LoRA: Geometry-Driven Identification for Parameter-Efficient Adaptation in Large Language Models

cs.LG · 2026-04-21 · unverdicted · novelty 6.0

RDP-selected 13 layers for LoRA on Qwen3-8B-Base reach 81.67% on MMLU-Math, beating full 36-layer adaptation at 79.32% and random 13-layer selection at 75.56%.

High-Dimensional Statistics: Reflections on Progress and Open Problems

math.ST · 2026-05-06

citing papers explorer

Showing 5 of 5 citing papers.

Manifold Steering Reveals the Shared Geometry of Neural Network Representation and Behavior
cs.LG · 2026-05-06 · unverdicted · none · ref 13

Task Vector Geometry Underlies Dual Modes of Task Inference in Transformers
cs.LG · 2026-05-05 · unverdicted · none · ref 37

Arithmetic in the Wild: Llama uses Base-10 Addition to Reason About Cyclic Concepts
cs.AI · 2026-05-01 · unverdicted · none · ref 13

RDP LoRA: Geometry-Driven Identification for Parameter-Efficient Adaptation in Large Language Models
cs.LG · 2026-04-21 · unverdicted · none · ref 7

High-Dimensional Statistics: Reflections on Progress and Open Problems
math.ST · 2026-05-06 · unreviewed · ref 91
