Deep batch active learning by diverse, uncertain gradient lower bounds
13 Pith papers cite this work. Polarity classification is still indexing.
verdicts: UNVERDICTED 13
roles: background 3
citing papers explorer
- MASS-DPO: Multi-negative Active Sample Selection for Direct Policy Optimization
  MASS-DPO derives a Plackett-Luce-specific log-determinant Fisher information objective to select non-redundant negative samples, matching or exceeding multi-negative DPO performance with substantially fewer negatives across four benchmarks and three model families.
- Clip-level Uncertainty and Temporal-aware Active Learning for End-to-End Multi-Object Tracking
  CUTAL scores multi-frame clips for uncertainty and enforces temporal diversity to train transformer MOT models to near full-supervision performance with 50% of the labels.
- UnIte: Uncertainty-based Iterative Document Sampling for Domain Adaptation in Information Retrieval
  UnIte selects target-domain documents for pseudo-query generation by filtering high aleatoric uncertainty and prioritizing high epistemic uncertainty, yielding +2.45 to +3.49 nDCG@10 gains on BEIR with ~4k samples.
- Active Learning with Selective Time-Step Acquisition for PDEs
  STAP reduces training data costs for PDE surrogates by selectively acquiring key time steps per trajectory instead of full simulations.
- Towards Multimodal Active Learning: Efficient Learning with Limited Paired Data
  Introduces the first active learning framework for unaligned multimodal data that selects alignments using uncertainty and diversity to cut annotation costs by up to 40% on benchmarks while preserving accuracy.
- Active Statistical Inference
  Active inference adapts label collection via ML uncertainty to deliver valid statistical inference with substantially fewer samples than standard non-adaptive methods across any data distribution.
- Fine-Tuning Language Models from Human Preferences
  Language models fine-tuned via RL on 5k-60k human preference comparisons produce stylistically better text continuations and human-preferred summaries that sometimes copy input sentences.
- Select Smarter, Not More: Prompt-Aware Evaluation Scheduling with Submodular Guarantees
  POES frames prompt evaluation as online adaptive testing and uses a provably submodular objective to pick informative examples, delivering 6.2% higher average accuracy and 35-60% token savings versus naive full-set scoring.
- Are Candidate Models Really Needed for Active Learning?
  Active learning with randomly initialized models achieves comparable results to traditional candidate-model methods, with low-confidence sampling proving most effective.
- Uncertainty-Guided Edge Learning for Deep Image Regression in Remote Sensing
  UGEL employs deep beta regression to estimate uncertainty in one forward pass, enabling faster convergence in edge learning for remote sensing image regression than active or semi-supervised baselines.
- Neural Operator Representation of Granular Micromechanics-based Failure Envelope
  A differentiable neural operator learns the mapping from granular microstructure configurations to failure envelopes, with physics-informed convexity enforcement and active learning for efficient training.
- Labeled TrustSet Guided: Batch Active Learning with Reinforcement Learning
  BRAL-T uses TrustSet-guided reinforcement learning for batch active learning and reports state-of-the-art results on 10 image classification benchmarks plus 2 fine-tuning tasks.
- ShieldGemma: Generative AI Content Moderation Based on Gemma
  ShieldGemma delivers a family of Gemma2-based classifiers that outperform Llama Guard and WildGuard on public safety benchmarks while introducing a synthetic-data curation pipeline for safety tasks.