pith. sign in

Title resolution pending

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

citation-role summary

background 1

citation-polarity summary

fields

cs.LG 3

verdicts

UNVERDICTED 3

roles

background 1

polarities

unclear 1

representative citing papers

Dataset Distillation

cs.LG · 2018-11-27 · unverdicted · novelty 8.0

Dataset distillation creates a tiny synthetic training set that, when used with a fixed network initialization, produces models whose performance approximates that of models trained on the full original dataset.

citing papers explorer

Showing 3 of 3 citing papers.

  • Dataset Distillation cs.LG · 2018-11-27 · unverdicted · none · ref 44

    Dataset distillation creates a tiny synthetic training set that, when used with a fixed network initialization, produces models whose performance approximates that of models trained on the full original dataset.

  • Training Language Models to Self-Correct via Reinforcement Learning cs.LG · 2024-09-19 · unverdicted · none · ref 275

    SCoRe uses multi-turn online RL with regularization on self-generated traces to improve LLM self-correction, achieving 15.6% and 9.1% gains on MATH and HumanEval for Gemini models.

  • Is Conditional Generative Modeling all you need for Decision-Making? cs.LG · 2022-11-28 · unverdicted · none · ref 52

    Return-conditional diffusion models for policies outperform offline RL on benchmarks by circumventing dynamic programming and enable constraint or skill composition.