arXiv preprint arXiv:2402.01920

Wu, J · 2024 · arXiv 2402.01920

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Efficient Preference Poisoning Attack on Offline RLHF

cs.LG · 2026-05-04 · unverdicted · novelty 7.0

Preference poisoning against log-linear DPO reduces to a binary sparse approximation problem solved by lattice-reduction (BAL-A) and matching-pursuit (BMP-A) algorithms that carry recovery guarantees.

Reframing AGI Confrontation with Off Earth Autonomy

cs.CY · 2026-06-18 · unverdicted · novelty 4.0

An off-Earth autonomy pathway can reduce AGI confrontation incentives by making early cooperation preferable to power-seeking on Earth.

citing papers explorer

Efficient Preference Poisoning Attack on Offline RLHF

cs.LG · 2026-05-04 · unverdicted · none · ref 43

Preference poisoning against log-linear DPO reduces to a binary sparse approximation problem solved by lattice-reduction (BAL-A) and matching-pursuit (BMP-A) algorithms that carry recovery guarantees.
