pith. machine review for the scientific record.

arxiv: 1607.04579 · v2 · submitted 2016-07-15 · 💻 cs.LG · math.OC · stat.ML

Recognition: unknown

Learning from Conditional Distributions via Dual Embeddings

Authors on Pith: no claims yet
classification 💻 cs.LG · math.OC · stat.ML
keywords learning · conditional · distribution · problems · distributions · only · sample · samples
original abstract

Many machine learning tasks, such as learning with invariance and policy evaluation in reinforcement learning, can be characterized as problems of learning from conditional distributions. In such problems, each sample $x$ is itself associated with a conditional distribution $p(z|x)$ represented by samples $\{z_i\}_{i=1}^M$, and the goal is to learn a function $f$ that links these conditional distributions to target values $y$. These learning problems become very challenging when we have only limited samples, or in the extreme case only one sample, from each conditional distribution. Commonly used approaches either assume that $z$ is independent of $x$, or require an overwhelmingly large number of samples from each conditional distribution. To address these challenges, we propose a novel approach based on a new min-max reformulation of the problem of learning from conditional distributions. With this reformulation, we only need to deal with the joint distribution $p(z,x)$. We also design an efficient learning algorithm, Embedding-SGD, and establish its theoretical sample complexity for such problems. Finally, our numerical experiments on both synthetic and real-world datasets show that the proposed approach can significantly improve over existing algorithms.
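To see why the min-max reformulation helps, note that for the square loss the Fenchel interchange $(\mathbb{E}[f(z,x)\,|\,x] - y)^2 = \max_u\, 2u(\mathbb{E}[f(z,x)\,|\,x] - y) - u^2$ moves the conditional expectation inside a linear term, so the saddle-point objective is an expectation over the joint $p(z,x)$ and a single $z$ per $x$ gives an unbiased stochastic gradient. The sketch below illustrates this idea on a toy problem; it is not the paper's exact Embedding-SGD (the linear feature maps, step size, and synthetic task are all assumptions made for illustration):

```python
import numpy as np

# Hedged sketch of the min-max idea for the square loss. Using
#   (E[f(z,x)|x] - y)^2 = max_u 2*u*(E[f(z,x)|x] - y) - u^2,
# the objective E[2 u(x) (f(z,x) - y) - u(x)^2] is over the JOINT p(z, x),
# so one z-sample per x yields unbiased stochastic gradients.
rng = np.random.default_rng(0)

def phi_f(x, z):      # primal features for f(z, x) (assumed linear model)
    return np.array([1.0, x, z, x * z])

def phi_u(x):         # dual features for u(x) (assumed linear model)
    return np.array([1.0, x, x * x])

w_f = np.zeros(4)     # f(z, x) = w_f @ phi_f(x, z)
w_u = np.zeros(3)     # u(x)    = w_u @ phi_u(x)
eta = 0.05

for t in range(40000):
    x = rng.uniform(-1.0, 1.0)
    z = x + 0.3 * rng.normal()   # a single sample from p(z | x)
    y = x                        # toy target satisfying y = E[z | x]
    f_val = w_f @ phi_f(x, z)
    u_val = w_u @ phi_u(x)
    # descend in the primal f, ascend in the dual u
    w_f -= eta * 2.0 * u_val * phi_f(x, z)
    w_u += eta * (2.0 * (f_val - y) - 2.0 * u_val) * phi_u(x)

# evaluate: E_z[f(z, x)] should track y = x on a grid of x values
errs = []
for x in np.linspace(-1.0, 1.0, 11):
    zs = x + 0.3 * rng.normal(size=2000)
    est = np.mean([w_f @ phi_f(x, z) for z in zs])
    errs.append((est - x) ** 2)
mse = float(np.mean(errs))
```

At the saddle point the dual satisfies $u(x) = \mathbb{E}[f(z,x)\,|\,x] - y$, so driving it to zero forces the learned conditional expectation onto the targets, even though each update touches only one $(x, z, y)$ triple.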

This paper has not been read by Pith yet.

discussion (0)


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems

    cs.LG · 2020-05 · unverdicted · novelty 2.0

    Offline RL promises to extract high-utility policies from static datasets but faces fundamental challenges that current methods only partially address.