
Title resolution pending

2 Pith papers cite this work. Polarity classification is still indexing.


fields

cs.LG 2

years

2026 2

verdicts

UNVERDICTED 2

representative citing papers

Offline Contextual Bandits in the Presence of New Actions

cs.LG · 2026-05-18 · unverdicted · novelty 7.0

PONA integrates the LCPI estimator for new action selection with the DR estimator for existing actions to optimize policies in offline contextual bandits with evolving action spaces.

Off-Policy Learning with Limited Supply

cs.LG · 2026-03-19 · unverdicted · novelty 6.0 · 2 refs

OPLS is a new off-policy learning method for contextual bandits with limited supply. It outperforms conventional greedy approaches by prioritizing, for each user, items whose expected reward is relatively higher than for other users.

citing papers explorer

Showing 2 of 2 citing papers.

  • Offline Contextual Bandits in the Presence of New Actions cs.LG · 2026-05-18 · unverdicted · none · ref 38


  • Off-Policy Learning with Limited Supply cs.LG · 2026-03-19 · unverdicted · none · ref 31 · 2 links
