Title resolution pending

Model editing with canonical examples · 2023 · arXiv 2402.06155

3 Pith papers cite this work. Polarity classification is still indexing.


Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

representative citing papers

Sense Representations Are Inducible Interfaces

cs.CL · 2026-05-27 · unverdicted · novelty 7.0

ACROS induces explicit sense representations in frozen decoder LMs via gated residual addition, enabling competitive zero-shot WSD, lexical steering, and cross-lingual adaptation on SmolLM2-360M while preserving base quality.

AI as a Tool for Simulation-Based Experiments in Literary Studies

cs.CL · 2026-06-01 · unverdicted · novelty 4.0

Proposes AI-driven simulations for literary-historical experiments and reports preliminary text-generation results claiming the first limited in-distribution outputs matching human novels.

Towards Intrinsic Interpretability of Large Language Models: A Survey of Design Principles and Architectures

cs.CL · 2026-04-17 · unverdicted · novelty 4.0

This survey organizes intrinsic interpretability approaches for LLMs into five categories—functional transparency, concept alignment, representational decomposability, explicit modularization, and latent sparsity induction—while discussing challenges and future directions.

citing papers explorer

Showing 3 of 3 citing papers after filters.

Sense Representations Are Inducible Interfaces cs.CL · 2026-05-27 · unverdicted · none · ref 3
AI as a Tool for Simulation-Based Experiments in Literary Studies cs.CL · 2026-06-01 · unverdicted · none · ref 27
Towards Intrinsic Interpretability of Large Language Models: A Survey of Design Principles and Architectures cs.CL · 2026-04-17 · unverdicted · none · ref 3
