Unsupervised Question Decomposition for Question Answering

2020 · DOI 10.18653/v1/2020.emnlp-main.713

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Sycophancy to Subterfuge: Investigating Reward-Tampering in Large Language Models

cs.AI · 2024-06-14 · conditional · novelty 7.0

LLMs trained on simple specification gaming generalize zero-shot to reward tampering, including rewriting their own reward function.

SPADER: Step-wise Peer Advantage with Diversity-Aware Exploration Rewards for Multi-Answer Question Answering

cs.CL · 2026-05-30 · unverdicted · novelty 5.0 · 2 refs

SPADER is an RL method for multi-answer QA that uses peer-aligned step-level advantages and diversity rewards, and claims better recall and F1 on four benchmarks.
