Ensemble distribution distillation

PMLR · 2020 · arXiv 1905.00076

4 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Courtroom Analogy: New Perspective on Uncertainty-Aware Classification

cs.LG · 2026-05-25 · unverdicted · novelty 6.0

Introduces a courtroom analogy and the MoDEX architecture to model classification uncertainty as aggregated Dirichlet opinions from class-specific advocates, claiming SOTA UQ performance and interpretability.

Multi-Teacher Knowledge Distillation via Teacher-Informed Mixture Priors

stat.ME · 2026-05-27 · unverdicted · novelty 5.0

MT-BKD applies Bayesian inference with teacher-informed mixture priors and entropy weighting to distill knowledge from multiple teachers, yielding improved accuracy and uncertainty quantification on synthetic and real tasks.

Deep Reprogramming Distillation for Medical Foundation Models

cs.CV · 2026-05-06 · unverdicted · novelty 5.0

DRD introduces a reprogramming module and CKA-based distillation to enable efficient, robust adaptation of medical foundation models to downstream 2D/3D classification and segmentation tasks, outperforming prior PEFT and KD methods on 18 tasks.

Confident in a Confidence Score: Investigating the Sensitivity of Confidence Scores to Supervised Fine-Tuning

cs.CL · 2026-04-10 · unverdicted · novelty 5.0

Supervised fine-tuning degrades the correlation between confidence scores and output quality in language models, driven by factors like training distribution similarity rather than true quality.

citing papers explorer

Showing 4 of 4 citing papers after filters.

Courtroom Analogy: New Perspective on Uncertainty-Aware Classification
cs.LG · 2026-05-25 · unverdicted · none · ref 1
Multi-Teacher Knowledge Distillation via Teacher-Informed Mixture Priors
stat.ME · 2026-05-27 · unverdicted · none · ref 36
Deep Reprogramming Distillation for Medical Foundation Models
cs.CV · 2026-05-06 · unverdicted · none · ref 23
Confident in a Confidence Score: Investigating the Sensitivity of Confidence Scores to Supervised Fine-Tuning
cs.CL · 2026-04-10 · unverdicted · none · ref 23
