pith. sign in

Making pre-trained language models better few-shot learners

7 Pith papers cite this work. Polarity classification is still indexing.

7 Pith papers citing it

citation-role summary

background 1

citation-polarity summary

roles

background 1

polarities

background 1

representative citing papers

Large Language Models as Optimizers

cs.LG · 2023-09-07 · unverdicted · novelty 7.0

Large language models can optimize by being prompted with histories of past solutions and scores to propose better ones, producing prompts that raise accuracy up to 8% on GSM8K and 50% on Big-Bench Hard over human-designed baselines.

Cognitive Architectures for Language Agents

cs.AI · 2023-09-05 · accept · novelty 6.0

CoALA is a modular cognitive architecture for language agents that organizes memory components, action spaces for internal and external interaction, and a generalized decision-making loop to support more systematic development of capable agents.

On-Device Fine-Tuning via Backprop-Free Zeroth-Order Optimization

cs.LG · 2025-11-14 · unverdicted · novelty 5.0

MeZO enables larger models for on-device fine-tuning by estimating gradients via forward passes only, with theoretical size estimates and numerical results showing accuracy benefits when wall-clock time is sufficient.

Test-Time Alignment via Hypothesis Reweighting

cs.LG · 2024-12-11 · unverdicted · novelty 5.0

HyRe personalizes reward models at test time by reweighting an ensemble of heads trained on aggregate preferences, using few target examples to outperform uniform averaging and prior methods on RewardBench and 32 tasks.

On the Power of Foundation Models

cs.AI · 2022-11-29 · unverdicted · novelty 5.0

Category theory proves prompt-based learning on perfect foundation models works only for representable tasks, fine-tuning solves tasks in the pretext category, and models can represent unseen target-category objects using source-category structure.

citing papers explorer

Showing 7 of 7 citing papers.

  • Generative Agents: Interactive Simulacra of Human Behavior cs.HC · 2023-04-07 · accept · none · ref 41

    Generative agents with memory streams, reflection, and planning using LLMs exhibit believable individual and emergent social behaviors in a simulated town.

  • Graph Topology Information Enhanced Heterogeneous Graph Representation Learning cs.LG · 2026-04-07 · unverdicted · none · ref 9

    ToGRL learns high-quality graph structures from raw heterogeneous graphs via a two-stage topology extraction process and prompt tuning, outperforming prior methods on five datasets.

  • Large Language Models as Optimizers cs.LG · 2023-09-07 · unverdicted · none · ref 10

    Large language models can optimize by being prompted with histories of past solutions and scores to propose better ones, producing prompts that raise accuracy up to 8% on GSM8K and 50% on Big-Bench Hard over human-designed baselines.

  • Cognitive Architectures for Language Agents cs.AI · 2023-09-05 · accept · none · ref 26

    CoALA is a modular cognitive architecture for language agents that organizes memory components, action spaces for internal and external interaction, and a generalized decision-making loop to support more systematic development of capable agents.

  • On-Device Fine-Tuning via Backprop-Free Zeroth-Order Optimization cs.LG · 2025-11-14 · unverdicted · none · ref 28

    MeZO enables larger models for on-device fine-tuning by estimating gradients via forward passes only, with theoretical size estimates and numerical results showing accuracy benefits when wall-clock time is sufficient.

  • Test-Time Alignment via Hypothesis Reweighting cs.LG · 2024-12-11 · unverdicted · none · ref 19

    HyRe personalizes reward models at test time by reweighting an ensemble of heads trained on aggregate preferences, using few target examples to outperform uniform averaging and prior methods on RewardBench and 32 tasks.

  • On the Power of Foundation Models cs.AI · 2022-11-29 · unverdicted · none · ref 28

    Category theory proves prompt-based learning on perfect foundation models works only for representable tasks, fine-tuning solves tasks in the pretext category, and models can represent unseen target-category objects using source-category structure.