pith. sign in

hub Canonical reference

In-context vectors: Making in context learning more effective and controllable through latent space steering

Canonical reference. 80% of citing Pith papers cite this work as background.

11 Pith papers citing it
Background 80% of classified citations

hub tools

citation-role summary

background 4 method 1

citation-polarity summary

fields

cs.CL 7 cs.LG 4

representative citing papers

Steering Language Models With Activation Engineering

cs.CL · 2023-08-20 · unverdicted · novelty 7.0

Activation Addition steers language models by adding contrastive activation vectors from prompt pairs to control high-level properties like sentiment and toxicity at inference time without training.

Dictionary-Aligned Concept Control for Safeguarding Multimodal LLMs

cs.LG · 2026-04-10 · unverdicted · novelty 6.0

DACO curates a 15,000-concept dictionary from 400K image-caption pairs and uses it to initialize an SAE that enables granular, concept-specific steering of MLLM activations, raising safety scores on MM-SafetyBench and JailBreakV while preserving general capabilities.

Steering Llama 2 via Contrastive Activation Addition

cs.CL · 2023-12-09 · unverdicted · novelty 6.0

Contrastive Activation Addition steers Llama 2 Chat by adding averaged residual-stream activation differences from contrastive example pairs to control targeted behaviors at inference time.

MemOS: A Memory OS for AI System

cs.CL · 2025-07-04 · unverdicted · novelty 5.0

MemOS introduces a unified memory management framework for LLMs using MemCubes to handle and evolve different memory types for improved controllability and evolvability.

citing papers explorer

Showing 11 of 11 citing papers.