Uncovering hidden geometry in transformers via disentangling position and context

[SZ23] J · 2024 · arXiv 2310.04861

6 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background: 1 · method: 1

citation-polarity summary

unclear: 1 · use method: 1

representative citing papers

Manifold Steering Reveals the Shared Geometry of Neural Network Representation and Behavior

cs.LG · 2026-05-06 · unverdicted · novelty 7.0

Manifold steering along activation geometry induces behavioral trajectories matching the natural manifold of outputs, while linear steering produces off-manifold unnatural behaviors.

Task Vector Geometry Underlies Dual Modes of Task Inference in Transformers

cs.LG · 2026-05-05 · unverdicted · novelty 7.0

In a controlled synthetic setting, transformers implement in-distribution task inference via convex combinations of task vectors and out-of-distribution inference via nearly orthogonal extrapolative representations.

Arithmetic in the Wild: Llama uses Base-10 Addition to Reason About Cyclic Concepts

cs.AI · 2026-05-01 · unverdicted · novelty 7.0

Llama-3.1-8B computes sums for cyclic concepts using base-10 addition via task-agnostic Fourier features with periods 2, 5, and 10 rather than modular arithmetic in the concept period.

Give it Space! Explicit Disentangling of Positional and Semantic Representations in Encoders

cs.CL · 2026-05-28 · unverdicted · novelty 6.0

Explicitly disentangling semantic and positional streams in a Transformer encoder reveals that absolute positional representations collapse to a 2D document-structure manifold, attention heads specialize by role, and the approach improves linguistic probing performance on 49 of 65 phenomena.

RDP LoRA: Geometry-Driven Identification for Parameter-Efficient Adaptation in Large Language Models

cs.LG · 2026-04-21 · unverdicted · novelty 6.0

RDP-selected 13 layers for LoRA on Qwen3-8B-Base reach 81.67% on MMLU-Math, beating full 36-layer adaptation at 79.32% and random 13-layer selection at 75.56%.

High-Dimensional Statistics: Reflections on Progress and Open Problems

math.ST · 2026-05-06 · unverdicted · novelty 2.0 · 2 refs

This review synthesizes representative advances in high-dimensional statistics, highlights common themes and open problems, and points to key entry works.

citing papers explorer

Manifold Steering Reveals the Shared Geometry of Neural Network Representation and Behavior

cs.LG · 2026-05-06 · unverdicted · none · ref 13