Toy models demonstrate that polysemanticity arises when neural networks store more sparse features than neurons via superposition, producing a phase transition tied to polytope geometry and increased adversarial vulnerability.
An Introduction to Systems Biology: Design Principles of Biological Circuits
3 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
roles
background 2polarities
background 2representative citing papers
PUICL is a transformer pretrained on synthetic PU data from structural causal models that solves positive-unlabeled classification via in-context learning without gradient updates or fitting.
Sentiment is represented as a single linear direction in LLM activation space that is causally relevant across tasks and is summarized at punctuation and names in addition to charged words.
citing papers explorer
-
Toy Models of Superposition
Toy models demonstrate that polysemanticity arises when neural networks store more sparse features than neurons via superposition, producing a phase transition tied to polytope geometry and increased adversarial vulnerability.
-
In-Context Positive-Unlabeled Learning
PUICL is a transformer pretrained on synthetic PU data from structural causal models that solves positive-unlabeled classification via in-context learning without gradient updates or fitting.
-
Linear Representations of Sentiment in Large Language Models
Sentiment is represented as a single linear direction in LLM activation space that is causally relevant across tasks and is summarized at punctuation and names in addition to charged words.