Which Examples to Annotate for In-Context Learning? Towards Effective and Efficient Selection

Balasubramaniam Srinivasan; Christos Faloutsos; Costas Mavromatis; George Karypis; Huzefa Rangwala; Jiani Zhang; Zhengyuan Shen

arxiv: 2310.20046 · v1 · pith:VEARAQTWnew · submitted 2023-10-30 · 💻 cs.CL

Costas Mavromatis, Balasubramaniam Srinivasan, Zhengyuan Shen, Jiani Zhang, Huzefa Rangwala, Christos Faloutsos, George Karypis

classification 💻 cs.CL

keywords: examples, adaicl, improves, learning, sampling, budget, diversity-based, efficient

Large Language Models (LLMs) can adapt to new tasks via in-context learning (ICL). ICL is efficient because it requires no parameter updates to the trained LLM, only a few annotated examples as input. In this work, we investigate an active learning approach for ICL, where there is a limited budget for annotating examples. We propose a model-adaptive, optimization-free algorithm, termed AdaICL, which identifies examples that the model is uncertain about and performs semantic diversity-based example selection. Diversity-based sampling improves overall effectiveness, while uncertainty sampling improves budget efficiency and helps the LLM learn new information. Moreover, AdaICL poses its sampling strategy as a Maximum Coverage problem, which dynamically adapts based on the model's feedback and can be approximately solved via greedy algorithms. Extensive experiments on nine datasets and seven LLMs show that AdaICL improves performance by 4.4% accuracy points over SOTA (7.7% relative improvement), is up to 3x more budget-efficient than performing annotations uniformly at random, and outperforms SOTA with 2x fewer ICL examples.
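
The abstract notes that a Maximum Coverage problem can be approximately solved with a greedy algorithm. The snippet below is a minimal sketch of that generic greedy procedure, not the authors' implementation. It assumes each candidate example comes with a precomputed set of items it "covers" (for instance, uncertain examples that are semantically close to it); how those sets are built is not specified here. The function name `greedy_max_coverage` and the toy data are hypothetical.

```python
from typing import Dict, Hashable, List, Set


def greedy_max_coverage(
    cover_sets: Dict[Hashable, Set[Hashable]],
    budget: int,
) -> List[Hashable]:
    """Greedily pick up to `budget` candidates to cover as many items as possible.

    `cover_sets` maps each candidate to the set of items it covers
    (an assumed input; building these sets is outside this sketch).
    """
    covered: Set[Hashable] = set()
    selected: List[Hashable] = []
    remaining = dict(cover_sets)  # shallow copy so the caller's dict is untouched

    for _ in range(budget):
        best, best_gain = None, 0
        # Find the candidate that adds the most not-yet-covered items.
        for cand, items in remaining.items():
            gain = len(items - covered)
            if gain > best_gain:
                best, best_gain = cand, gain
        if best is None:  # no candidate adds new coverage; stop early
            break
        selected.append(best)
        covered |= remaining.pop(best)

    return selected


# Hypothetical toy data: candidate -> items it covers.
cover_sets = {
    "a": {1, 2, 3},
    "b": {3, 4},
    "c": {4, 5, 6, 7},
    "d": {1, 7},
}
picked = greedy_max_coverage(cover_sets, budget=2)
print(picked)  # ['c', 'a']
```

With a budget of 2, the sketch first picks `c` (four new items) and then `a` (three new items), covering all seven items. In an adaptive setting like the one the abstract describes, the cover sets would presumably be recomputed from the model's feedback between annotation rounds; that loop is omitted here.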

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Activation-Based Active Learning for In-Context Learning: Challenges and Insights
cs.CL 2026-06 unverdicted novelty 6.0

MLP activations measured as massive activations or first four moments correlate weakly (max |Spearman| = 0.33) with in-context example quality across Llama-3.2-3B, Qwen2.5-3B, and multiple classification/generative ta...

SnapAudit: Active Auditing of Differentially Private In-Context Learning via Snapshot-Based Simulation
cs.CR 2025-11 conditional novelty 6.0

SnapAudit decomposes DP-ICL into a deterministic snapshot stage and a stochastic noise stage, using bootstrap simulation to achieve 80-200x faster auditing and exposing privacy bound violations in existing Gaussian an...