Language models fail to introspect about their knowledge of language

Siyuan Song, Jennifer Hu, Kyle Mahowald · 2025 · arXiv 2503.07513

1 Pith paper cites this work. Polarity classification is still being indexed.


representative citing papers

Do Activation Verbalization Methods Convey Privileged Information?

cs.CL · 2025-09-16 · unverdicted · novelty 5.0

Activation verbalization methods for LLMs largely reflect the verbalizer model's parametric knowledge rather than privileged information from the target model's activations.

citing papers explorer

Showing 1 of 1 citing paper.

Do Activation Verbalization Methods Convey Privileged Information? · cs.CL · 2025-09-16 · unverdicted · none · ref 49
