hub

A is B” fail to learn “B is A

Lukas Berglund, Meg Tong, Max Kaufmann, Mikita Balesni, Asa Cooper Stickland, Tomasz Korbak, Owain Evans · 2023 · arXiv 2309.12288

12 Pith papers cite this work. Polarity classification is still indexing.

12 Pith papers citing it

read on arXiv browse 12 citing papers

hub tools

JSON dossier citing papers JSON arXiv source

citation-role summary

background 3

citation-polarity summary

background 2 support 1

representative citing papers

Large Language Diffusion Models

cs.CL · 2025-02-14 · unverdicted · novelty 8.0

LLaDA is a scalable diffusion-based language model that matches autoregressive LLMs like LLaMA3 8B on tasks and surpasses GPT-4o on reversal poem completion.

Your Absorbing Discrete Diffusion Secretly Models the Conditional Distributions of Clean Data

cs.LG · 2024-06-06 · conditional · novelty 7.0

Absorbing discrete diffusion models the conditional distributions of clean data; reparameterizing yields a time-independent RADD that unifies with AO-ARMs and reaches SOTA perplexity among diffusion models on zero-shot language benchmarks.

Continuous Latent Diffusion Language Model

cs.CL · 2026-05-07 · unverdicted · novelty 6.0

Cola DLM proposes a hierarchical latent diffusion model that learns a text-to-latent mapping, fits a global semantic prior in continuous space with a block-causal DiT, and performs conditional decoding, establishing latent prior modeling as an alternative to token-level autoregressive language model

Trace Mutation in Human-LLM Dialogue: The Transcript as Forensic and Mitigation Surface

cs.HC · 2026-03-31 · unverdicted · novelty 6.0

Trace mutations are a class of context failures in LLM conversations consisting of utterance effacement and genitive dissociation that distort the shared record while resisting ordinary repair.

The Illusion of Latent Generalization: Bi-directionality and the Reversal Curse

cs.CL · 2026-03-13 · unverdicted · novelty 6.0

Bidirectional objectives mitigate reversal by requiring explicit source-as-target signals and storing directions as distinct representations instead of inducing latent generalization.

Diffusion-Inspired Masked Fine-Tuning for Knowledge Injection in Autoregressive LLMs

cs.CL · 2025-10-10 · unverdicted · novelty 6.0

Masked fine-tuning enables autoregressive LLMs to inject new factual knowledge without paraphrases and with reversal-curse resistance, matching diffusion LLM advantages on QA tasks.

Argumentative Large Language Models for Explainable and Contestable Claim Verification

cs.CL · 2024-05-03 · unverdicted · novelty 6.0

ArgLLMs build argumentation frameworks from LLMs to support explainable and contestable formal reasoning for claim verification.

LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code

cs.SE · 2024-03-12 · unverdicted · novelty 6.0

LiveCodeBench collects 400 recent contest problems to create a contamination-free benchmark evaluating LLMs on code generation and related capabilities like self-repair and execution.

MeMo: Memory as a Model

cs.CL · 2026-05-14 · unverdicted · novelty 5.0 · 2 refs

MeMo encodes new knowledge into a separate memory model that integrates with frozen LLMs, showing strong performance on QA benchmarks while avoiding catastrophic forgetting and working without access to model weights.

Absurd World: A Simple Yet Powerful Method to Absurdify the Real-world for Probing LLM Reasoning Capabilities

cs.AI · 2026-05-10 · unverdicted · novelty 5.0

Absurd World automatically converts real-world problems into absurd yet logically coherent scenarios to test whether LLMs can reason without depending on familiar patterns.

A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions

cs.CL · 2023-11-09 · unverdicted · novelty 5.0

The paper surveys hallucination in LLMs with an innovative taxonomy, factors, detection methods, benchmarks, mitigation strategies, and open research directions.

Measuring AI Reasoning: A Guide for Researchers

cs.AI · 2026-05-04 · unverdicted · novelty 4.0

Reasoning in language models should be measured by the faithfulness and validity of their multi-step search processes and intermediate traces, not final-answer accuracy.

citing papers explorer

Showing 12 of 12 citing papers.

Large Language Diffusion Models cs.CL · 2025-02-14 · unverdicted · none · ref 16
LLaDA is a scalable diffusion-based language model that matches autoregressive LLMs like LLaMA3 8B on tasks and surpasses GPT-4o on reversal poem completion.
Your Absorbing Discrete Diffusion Secretly Models the Conditional Distributions of Clean Data cs.LG · 2024-06-06 · conditional · none · ref 66
Absorbing discrete diffusion models the conditional distributions of clean data; reparameterizing yields a time-independent RADD that unifies with AO-ARMs and reaches SOTA perplexity among diffusion models on zero-shot language benchmarks.
Continuous Latent Diffusion Language Model cs.CL · 2026-05-07 · unverdicted · none · ref 7
Cola DLM proposes a hierarchical latent diffusion model that learns a text-to-latent mapping, fits a global semantic prior in continuous space with a block-causal DiT, and performs conditional decoding, establishing latent prior modeling as an alternative to token-level autoregressive language model
Trace Mutation in Human-LLM Dialogue: The Transcript as Forensic and Mitigation Surface cs.HC · 2026-03-31 · unverdicted · none · ref 2
Trace mutations are a class of context failures in LLM conversations consisting of utterance effacement and genitive dissociation that distort the shared record while resisting ordinary repair.
The Illusion of Latent Generalization: Bi-directionality and the Reversal Curse cs.CL · 2026-03-13 · unverdicted · none · ref 1
Bidirectional objectives mitigate reversal by requiring explicit source-as-target signals and storing directions as distinct representations instead of inducing latent generalization.
Diffusion-Inspired Masked Fine-Tuning for Knowledge Injection in Autoregressive LLMs cs.CL · 2025-10-10 · unverdicted · none · ref 1
Masked fine-tuning enables autoregressive LLMs to inject new factual knowledge without paraphrases and with reversal-curse resistance, matching diffusion LLM advantages on QA tasks.
Argumentative Large Language Models for Explainable and Contestable Claim Verification cs.CL · 2024-05-03 · unverdicted · none · ref 9
ArgLLMs build argumentation frameworks from LLMs to support explainable and contestable formal reasoning for claim verification.
LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code cs.SE · 2024-03-12 · unverdicted · none · ref 82
LiveCodeBench collects 400 recent contest problems to create a contamination-free benchmark evaluating LLMs on code generation and related capabilities like self-repair and execution.
MeMo: Memory as a Model cs.CL · 2026-05-14 · unverdicted · none · ref 58 · 2 links
MeMo encodes new knowledge into a separate memory model that integrates with frozen LLMs, showing strong performance on QA benchmarks while avoiding catastrophic forgetting and working without access to model weights.
Absurd World: A Simple Yet Powerful Method to Absurdify the Real-world for Probing LLM Reasoning Capabilities cs.AI · 2026-05-10 · unverdicted · none · ref 3
Absurd World automatically converts real-world problems into absurd yet logically coherent scenarios to test whether LLMs can reason without depending on familiar patterns.
A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions cs.CL · 2023-11-09 · unverdicted · none · ref 23
The paper surveys hallucination in LLMs with an innovative taxonomy, factors, detection methods, benchmarks, mitigation strategies, and open research directions.
Measuring AI Reasoning: A Guide for Researchers cs.AI · 2026-05-04 · unverdicted · none · ref 40
Reasoning in language models should be measured by the faithfulness and validity of their multi-step search processes and intermediate traces, not final-answer accuracy.

A is B” fail to learn “B is A

hub tools

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer