NeurIPS , year=

Energy-based out-of-distribution detection , author=

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

browse 2 citing papers

representative citing papers

Distinguishable Deletion: Unifying Knowledge Erasure and Refusal for Large Language Model Unlearning

cs.LG · 2026-05-16 · unverdicted · novelty 6.0

Distinguishable Deletion unifies knowledge erasure and refusal for LLM unlearning via an energy index that enforces boundaries during training and enables refusal at inference.

How Language Models Process Out-of-Distribution Inputs: A Two-Pathway Framework

cs.CL · 2026-04-30 · unverdicted · novelty 6.0

LLM OOD detectors are length-confounded; a two-pathway embedding-plus-trajectory framework detects covert OOD inputs at 0.721 average AUROC and 0.850 on jailbreaks.

citing papers explorer

Showing 1 of 1 citing paper after filters.

How Language Models Process Out-of-Distribution Inputs: A Two-Pathway Framework cs.CL · 2026-04-30 · unverdicted · none · ref 11
LLM OOD detectors are length-confounded; a two-pathway embedding-plus-trajectory framework detects covert OOD inputs at 0.721 average AUROC and 0.850 on jailbreaks.

NeurIPS , year=

fields

years

verdicts

representative citing papers

citing papers explorer