Title resolution pending

Lihong Li, Shunbao Chen, Jim Kleban, Ankur Gupta · 2015

2 Pith papers cite this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

representative citing papers

Offline Contextual Bandits in the Presence of New Actions

cs.LG · 2026-05-18 · unverdicted · novelty 7.0

PONA integrates the LCPI estimator for new action selection with the DR estimator for existing actions to optimize policies in offline contextual bandits with evolving action spaces.

Off-Policy Learning with Limited Supply

cs.LG · 2026-03-19 · unverdicted · novelty 6.0 · 2 refs

OPLS is a new off-policy learning method for contextual bandits with limited supply that outperforms conventional greedy approaches by prioritizing items with relatively higher expected rewards compared to other users.

citing papers explorer

Showing 2 of 2 citing papers.

Offline Contextual Bandits in the Presence of New Actions · cs.LG · 2026-05-18 · unverdicted · none · ref 16
Off-Policy Learning with Limited Supply · cs.LG · 2026-03-19 · unverdicted · none · ref 16 · 2 links
