In: Gurevych, I., Miyao, Y

Howard, J · 2018 · DOI 10.18653/v1/p18-1031

10 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 3

citation-polarity summary

background 2 support 1

representative citing papers

Oracle Supervision Transfers for Hyperparameter Prediction in Model-Based Image Denoising

cs.CV · 2026-05-19 · conditional · novelty 7.0

HyperDn is a configuration-conditioned predictor that transfers oracle supervision across denoising paradigms to achieve near-oracle hyperparameter prediction with few or zero target labels.

TILT: Target-induced loss tilting under covariate shift

cs.LG · 2026-05-14 · conditional · novelty 7.0

TILT adds a target-data penalty on an auxiliary predictor component to induce effective importance weighting for unsupervised domain adaptation under covariate shift.

LLM-guided Semi-Supervised Approaches for Social Media Crisis Data Classification

cs.AI · 2026-05-08 · conditional · novelty 7.0

LG-CoTrain, an LLM-guided co-training method, outperforms classical semi-supervised baselines for crisis tweet classification in low-resource settings with 5-25 labeled examples per class.

The Power of Scale for Parameter-Efficient Prompt Tuning

cs.CL · 2021-04-18 · unverdicted · novelty 7.0

Prompt tuning matches full model tuning performance on large language models while tuning only a small fraction of parameters and improves robustness to domain shifts.

How Many Different Outputs Can a Transformer Generate?

cs.LG · 2026-05-21 · unverdicted · novelty 6.0

Transformers are limited to a linearly growing number of accessible output sequences with prompt length, with exponential decay in accessible proportion beyond a critical point, even under unbounded context.

PatchTrack: A Comprehensive Analysis of ChatGPT's Influence on Pull Request Outcomes

cs.SE · 2025-05-12 · conditional · novelty 6.0

Empirical analysis of 338 PRs with self-admitted ChatGPT usage shows low full integration (median 25%), selective adaptation patterns, and broader influence on developer reasoning during reviews.

The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale

cs.CL · 2024-06-25 · unverdicted · novelty 6.0

FineWeb is a curated 15T-token web dataset that produces stronger LLMs than prior open collections, while its educational subset sharply improves performance on MMLU and ARC benchmarks.

Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes

cs.CL · 2023-05-03 · conditional · novelty 6.0

Distilling step-by-step uses LLM-generated rationales as additional supervision in a multi-task framework so that 770M-parameter models outperform 540B-parameter models on NLP benchmarks with only 80% of the data.

BloombergGPT: A Large Language Model for Finance

cs.LG · 2023-03-30 · conditional · novelty 6.0

BloombergGPT is a 50B parameter LLM trained on a 708B token mixed financial and general dataset that outperforms prior models on financial benchmarks while preserving general LLM performance.

PaLM 2 Technical Report

cs.CL · 2023-05-17 · unverdicted · novelty 5.0

PaLM 2 reports state-of-the-art results on language, reasoning, and multilingual tasks with improved efficiency over PaLM.

citing papers explorer

Oracle Supervision Transfers for Hyperparameter Prediction in Model-Based Image Denoising · cs.CV · 2026-05-19 · conditional · none · ref 10
TILT: Target-induced loss tilting under covariate shift · cs.LG · 2026-05-14 · conditional · none · ref 90
LLM-guided Semi-Supervised Approaches for Social Media Crisis Data Classification · cs.AI · 2026-05-08 · conditional · none · ref 91
The Power of Scale for Parameter-Efficient Prompt Tuning · cs.CL · 2021-04-18 · unverdicted · none · ref 17
How Many Different Outputs Can a Transformer Generate? · cs.LG · 2026-05-21 · unverdicted · none · ref 26
PatchTrack: A Comprehensive Analysis of ChatGPT's Influence on Pull Request Outcomes · cs.SE · 2025-05-12 · conditional · none · ref 32
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale · cs.CL · 2024-06-25 · unverdicted · none · ref 14
Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes · cs.CL · 2023-05-03 · conditional · none · ref 79
BloombergGPT: A Large Language Model for Finance · cs.LG · 2023-03-30 · conditional · none · ref 46
PaLM 2 Technical Report · cs.CL · 2023-05-17 · unverdicted · none · ref 66
