The Few-shot Dilemma: Over-prompting Large Language Models

Christian Koerner; Doruk Tuncel; Thomas Runkler; Yongjian Tang

arxiv: 2509.13196 · v1 · pith:MESO4A3Unew · submitted 2025-09-16 · 💻 cs.CL

The Few-shot Dilemma: Over-prompting Large Language Models

Yongjian Tang , Doruk Tuncel , Christian Koerner , Thomas Runkler This is my paper

classification 💻 cs.CL

keywords few-shotexamplesllmsover-promptingperformancedilemmaexcessivelanguage

0 comments

read the original abstract

Over-prompting, a phenomenon where excessive examples in prompts lead to diminished performance in Large Language Models (LLMs), challenges the conventional wisdom about in-context few-shot learning. To investigate this few-shot dilemma, we outline a prompting framework that leverages three standard few-shot selection methods - random sampling, semantic embedding, and TF-IDF vectors - and evaluate these methods across multiple LLMs, including GPT-4o, GPT-3.5-turbo, DeepSeek-V3, Gemma-3, LLaMA-3.1, LLaMA-3.2, and Mistral. Our experimental results reveal that incorporating excessive domain-specific examples into prompts can paradoxically degrade performance in certain LLMs, which contradicts the prior empirical conclusion that more relevant few-shot examples universally benefit LLMs. Given the trend of LLM-assisted software engineering and requirement analysis, we experiment with two real-world software requirement classification datasets. By gradually increasing the number of TF-IDF-selected and stratified few-shot examples, we identify their optimal quantity for each LLM. This combined approach achieves superior performance with fewer examples, avoiding the over-prompting problem, thus surpassing the state-of-the-art by 1% in classifying functional and non-functional requirements.

This paper has not been read by Pith yet.

discussion (0)

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Sustainability Analysis of Prompt Strategies for SLM-based Automated Test Generation
cs.SE 2026-04 unverdicted novelty 6.0

Prompt strategies for SLM-based automated test generation vary widely in energy consumption and carbon emissions, with simpler strategies delivering competitive coverage at markedly lower environmental cost.
The PICCO Framework for Large Language Model Prompting: A Taxonomy and Reference Architecture for Prompt Structure
cs.CL 2026-04 accept novelty 5.0

PICCO is a five-element reference architecture (Persona, Instructions, Context, Constraints, Output) for structuring LLM prompts, derived from synthesizing prior frameworks along with a taxonomy distinguishing prompt ...