arXiv: 2509.20587 · v3 · submitted 2025-09-24 · stat.ML · cs.LG · stat.ME

Unsupervised Domain Adaptation for Binary Classification with an Unobservable Source Subpopulation

keywords: domain, source, subpopulation, binary, prediction, unobservable, adaptation, method

We study an unsupervised domain adaptation problem where the source domain consists of subpopulations defined by the binary label $Y$ and a binary background (or environment) $A$. We focus on a challenging setting in which one such subpopulation in the source domain is unobservable. Naively ignoring this unobserved group can result in biased estimates and degraded predictive performance. Despite this structured missingness, we show that the prediction in the target domain can still be recovered. Specifically, we rigorously derive both background-specific and overall prediction models for the target domain. For practical implementation, we propose the distribution matching method to estimate the subpopulation proportions. We provide theoretical guarantees for the asymptotic behavior of our estimator, and establish an upper bound on the prediction error. Experiments on both synthetic and real-world datasets show that our method outperforms the naive benchmark that does not account for this unobservable source subpopulation.
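
To make the idea of estimating subpopulation proportions by distribution matching concrete, here is a minimal sketch in Python. It is not the paper's estimator: it assumes a simplified setting with only two components whose samples are both available, and it matches first moments (feature means) rather than full distributions. The names `match_proportion`, `mean_vector`, and `sample`, and the synthetic Gaussian data, are hypothetical and chosen only for illustration.

```python
import random


def mean_vector(rows):
    """Coordinate-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    d = len(rows[0])
    return [sum(r[j] for r in rows) / n for j in range(d)]


def match_proportion(comp0, comp1, target):
    """Estimate pi such that mean(target) ~ pi * mean(comp1) + (1 - pi) * mean(comp0).

    This is the least-squares solution of a first-moment matching problem,
    clipped to [0, 1]. It is an illustrative stand-in, not the paper's method.
    """
    m0, m1, mt = mean_vector(comp0), mean_vector(comp1), mean_vector(target)
    diff = [b - a for a, b in zip(m0, m1)]
    denom = sum(x * x for x in diff)
    if denom == 0.0:
        raise ValueError("component means coincide; proportion is not identifiable")
    num = sum((t - a) * x for t, a, x in zip(mt, m0, diff))
    return min(1.0, max(0.0, num / denom))


def sample(rng, center, n):
    """Draw n points from a unit-variance Gaussian centred at `center`."""
    return [[rng.gauss(c, 1.0) for c in center] for _ in range(n)]


# Synthetic check (assumed data): the target is a 30/70 mixture of the two components.
rng = random.Random(0)
comp0 = sample(rng, [0.0, 0.0], 2000)
comp1 = sample(rng, [2.0, 1.0], 2000)
target = sample(rng, [2.0, 1.0], 600) + sample(rng, [0.0, 0.0], 1400)
pi_hat = match_proportion(comp0, comp1, target)
print(f"estimated proportion of component 1: {pi_hat:.3f}")
```

Mean matching is used here only because it has a closed form. The setting in the abstract is harder, since one source subpopulation is unobservable and its distribution cannot be sampled directly.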

