Title resolution pending

Yu Yuan, Lili Zhao, Kai Zhang, Guangting Zheng, Qi Liu · 2024 · DOI 10.18653/v1/2024.emnlp-main.679

2 Pith papers cite this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

representative citing papers

Calibration vs Decision Making: Revisiting the Reliability Paradox in Unlearned Language Models

cs.CL · 2026-05-20 · unverdicted · novelty 5.0

Unlearned language models retain low calibration error but show increased shortcut reliance on the TOFU benchmark, extending the reliability paradox to machine unlearning.

Measuring AI Reasoning: A Guide for Researchers

cs.AI · 2026-05-04 · unverdicted · novelty 4.0

Reasoning in language models should be measured by the faithfulness and validity of their multi-step search processes and intermediate traces, not final-answer accuracy.

citing papers explorer

Showing 2 of 2 citing papers.

Calibration vs Decision Making: Revisiting the Reliability Paradox in Unlearned Language Models · cs.CL · 2026-05-20 · unverdicted · none · ref 68
Measuring AI Reasoning: A Guide for Researchers · cs.AI · 2026-05-04 · unverdicted · none · ref 77
