Language models are few-shot learners

Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al · 1901

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

browse 2 citing papers

representative citing papers

Provable Knowledge Acquisition and Extraction in One-Layer Transformers

cs.LG · 2025-07-28 · unverdicted · novelty 6.0

In a stylized one-layer transformer, pre-training encodes factual knowledge via relation-specific feature directions and attention patterns; fine-tuning extracts it through a relation-covering mechanism that succeeds when enough latent templates are triggered, with a failure regime explaining inauds

Through the Stealth Lens: Attention-Aware Defenses Against Poisoning in RAG

cs.CR · 2025-06-04 · unverdicted · novelty 5.0

Introduces NPAS and AV Filter using LLM attention weights to defend RAG against poisoning, reporting up to 20% accuracy gains while adaptive attacks reach 35% success.

citing papers explorer

Showing 2 of 2 citing papers.

Provable Knowledge Acquisition and Extraction in One-Layer Transformers cs.LG · 2025-07-28 · unverdicted · none · ref 5
In a stylized one-layer transformer, pre-training encodes factual knowledge via relation-specific feature directions and attention patterns; fine-tuning extracts it through a relation-covering mechanism that succeeds when enough latent templates are triggered, with a failure regime explaining inauds
Through the Stealth Lens: Attention-Aware Defenses Against Poisoning in RAG cs.CR · 2025-06-04 · unverdicted · none · ref 2
Introduces NPAS and AV Filter using LLM attention weights to defend RAG against poisoning, reporting up to 20% accuracy gains while adaptive attacks reach 35% success.

Language models are few-shot learners

fields

years

verdicts

representative citing papers

citing papers explorer