Chain-of-thought prompting, by including intermediate reasoning steps in few-shot examples, elicits strong reasoning abilities in large language models on arithmetic, commonsense, and symbolic tasks.
Zhao, Eric Wallace, Shi Feng, Dan Klein, and Sameer Singh
9 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
roles
background 2polarities
background 2representative citing papers
An unsupervised technique extracts latent yes-no knowledge from language model activations by locating a direction that satisfies logical consistency properties, outperforming zero-shot accuracy by 4% on average across models and datasets.
LLMs exhibit an accumulated message effect where conversation history saturated with positive or negative evaluations biases subsequent judgments, with larger shifts on uncertain items, a negativity asymmetry, and no increase with context length.
Fine-tuning 7B code LLMs on a custom multi-file DSL dataset achieves structural fidelity of 1.00, high exact-match accuracy, and practical utility validated by expert survey and execution checks.
Scaling and instruction tuning increase sycophancy in LLMs on opinion and fact tasks, but a synthetic data fine-tuning intervention reduces it on held-out prompts.
Emergent abilities are capabilities present in large language models but absent in smaller ones and cannot be predicted by extrapolating smaller model performance.
In-context learning shows persistent interference from prior examples, with more misleading linear examples degrading quadratic predictions and training curricula modulating recovery speed.
SSAS improves LLM sentiment prediction consistency and data quality by up to 30% on three review datasets via syntactic and semantic context assessment summarization.
Introduces Explicit Logic Channel (ELC) with LLM, VFM and probabilistic inference for validating, selecting and enhancing MLLMs on zero-shot tasks using Consistency Rate and cross-channel integration.
citing papers explorer
-
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Chain-of-thought prompting, by including intermediate reasoning steps in few-shot examples, elicits strong reasoning abilities in large language models on arithmetic, commonsense, and symbolic tasks.
-
Discovering Latent Knowledge in Language Models Without Supervision
An unsupervised technique extracts latent yes-no knowledge from language model activations by locating a direction that satisfies logical consistency properties, outperforming zero-shot accuracy by 4% on average across models and datasets.
-
AMEL: Accumulated Message Effects on LLM Judgments
LLMs exhibit an accumulated message effect where conversation history saturated with positive or negative evaluations biases subsequent judgments, with larger shifts on uncertain items, a negativity asymmetry, and no increase with context length.
-
Leveraging LLMs for Multi-File DSL Code Generation: An Industrial Case Study
Fine-tuning 7B code LLMs on a custom multi-file DSL dataset achieves structural fidelity of 1.00, high exact-match accuracy, and practical utility validated by expert survey and execution checks.
-
Simple synthetic data reduces sycophancy in large language models
Scaling and instruction tuning increase sycophancy in LLMs on opinion and fact tasks, but a synthetic data fine-tuning intervention reduces it on held-out prompts.
-
Emergent Abilities of Large Language Models
Emergent abilities are capabilities present in large language models but absent in smaller ones and cannot be predicted by extrapolating smaller model performance.
-
When Context Sticks: Studying Interference in In-Context Learning
In-context learning shows persistent interference from prior examples, with more misleading linear examples degrading quadratic predictions and training curricula modulating recovery speed.
-
Consistency Analysis of Sentiment Predictions using Syntactic & Semantic Context Assessment Summarization (SSAS)
SSAS improves LLM sentiment prediction consistency and data quality by up to 30% on three review datasets via syntactic and semantic context assessment summarization.
-
Explicit Logic Channel for Validation and Enhancement of MLLMs on Zero-Shot Tasks
Introduces Explicit Logic Channel (ELC) with LLM, VFM and probabilistic inference for validating, selecting and enhancing MLLMs on zero-shot tasks using Consistency Rate and cross-channel integration.