CogSteer: Cognition-Inspired Selective Layer Intervention for Efficiently Steering Large Language Models

Wang, Xintong; Pan, Jingheng; Ding, Liang; Wang, Longyue; Jiang, Longqin; Li, Xingshan · 2025 · DOI 10.18653/v1/2025.findings-acl.1308

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Adversarial Robustness of Activation Steering in Large Language Models

cs.LG · 2026-06-05 · unverdicted · novelty 7.0

First systematic test shows activation steering robustness drops sharply (up to 64%) under adversarial input perturbations across multiple extraction methods, models, and personas.

TALAN: Task-Aligned Latent Adaptation Networks for Targeted Post-Training of Large Language Models

cs.LG · 2026-06-05 · unverdicted · novelty 6.0

TALAN inserts a trainable latent memory path that remixes sequence information into small orthogonal perturbations, delivering 1.41-1.85 point average gains over matched LoRA and DoRA on four Qwen backbones and STEM/code benchmarks while adding under 1% parameters.

citing papers explorer

Adversarial Robustness of Activation Steering in Large Language Models
cs.LG · 2026-06-05 · unverdicted · none · ref 30
TALAN: Task-Aligned Latent Adaptation Networks for Targeted Post-Training of Large Language Models
cs.LG · 2026-06-05 · unverdicted · none · ref 23