pith. machine review for the scientific record.

arxiv: 2605.05953 · v1 · submitted 2026-05-07 · 💻 cs.CL · cs.AI

Recognition: unknown

Hallucination as an Anomaly: Dynamic Intervention via Probabilistic Circuits

Authors on Pith: no claims yet

Pith reviewed 2026-05-08 10:47 UTC · model grok-4.3

classification 💻 cs.CL · cs.AI
keywords hallucination detection · probabilistic circuits · density estimation · large language models · anomaly detection · contrastive decoding · residual stream · truthfulness

The pith

Probabilistic circuits detect LLM hallucinations as anomalies in residual stream states.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to treat hallucinations not as inevitable errors but as identifiable deviations from factual patterns inside the model. It trains a probabilistic circuit on the residual stream activations to act as a density estimator that flags low-likelihood states at each generation step. This detection then activates a selective contrastive decoding step only for those anomalous tokens, leaving accurate generations unchanged. A sympathetic reader would care because blanket correction methods often introduce new errors into previously correct output, and a step-wise check could improve reliability without broad side effects.

Core claim

PCNET is a probabilistic circuit trained as a tractable density estimator over an LLM's residual stream. It computes exact negative log-likelihood for each hidden state to mark hallucinations as geometric anomalies away from the factual manifold. When an anomaly is detected, the companion PC-LDCD method performs contrastive decoding at that step alone, raising truthfulness metrics while preserving originally correct generations.

What carries the argument

PCNET, a probabilistic circuit acting as a density estimator on LLM residual stream activations to compute exact negative log-likelihood as an anomaly score for selective intervention.
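The detect-then-gate pattern this describes is compact enough to sketch. A minimal illustration, assuming a diagonal Gaussian as a stand-in for the density model (the paper's PCNET is a probabilistic circuit, which computes exact likelihoods under a far more expressive model); `GaussianDensity` and `gated_decode_step` are hypothetical names for illustration, not the authors' code:

```python
import math

# Stand-in density model: a diagonal Gaussian fitted to "factual" hidden-state
# vectors. The paper's PCNET is a probabilistic circuit; the gating logic
# around it is the same.
class GaussianDensity:
    def fit(self, states):
        d, n = len(states[0]), len(states)
        self.mu = [sum(s[j] for s in states) / n for j in range(d)]
        self.var = [max(sum((s[j] - self.mu[j]) ** 2 for s in states) / n, 1e-6)
                    for j in range(d)]
        return self

    def nll(self, state):
        # Exact negative log-likelihood of one hidden state under the density.
        return 0.5 * sum(math.log(2 * math.pi * v) + (x - m) ** 2 / v
                         for x, m, v in zip(state, self.mu, self.var))


def gated_decode_step(state, density, tau, standard_step, corrective_step):
    """Run the corrective decoder (e.g. contrastive decoding) only when the
    hidden state is anomalous; factual states pass through untouched."""
    if density.nll(state) >= tau:
        return corrective_step(state)
    return standard_step(state)
```

In the paper's setting the threshold τ would be calibrated on held-out factual states; here it is simply a free parameter.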

If this is right

  • PCNET reaches AUROC values up to 99 percent for hallucination detection on CoQA, SQuAD v2.0, and TriviaQA across four different LLMs.
  • PC-LDCD raises True+Info, MC2, and MC3 scores on TruthfulQA in three of the four tested models.
  • The approach lowers mean corruption rate to 53.7 percent while keeping a 79.3 percent preservation rate for originally correct outputs.
  • Detection and intervention require no weight changes, sampling, or external verifiers at inference time.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Internal residual stream activations may encode a structured manifold of factual knowledge that density estimators can model without retraining the underlying LLM.
  • The same anomaly-detection logic could extend to other generation problems such as logical contradictions or unsafe content by retraining the circuit on appropriate labels.
  • Because the circuit computes likelihood exactly and without sampling, it could support low-overhead monitoring in production systems that already expose residual states.

Load-bearing premise

Hallucinations reliably appear as geometric anomalies on a factual manifold in the LLM residual stream, identifiable by elevated negative log-likelihood under a tractable probabilistic circuit density estimator, without sampling or external verifiers.

What would settle it

If negative log-likelihood scores from the trained probabilistic circuit show no reliable separation between verified factual and hallucinated hidden states across held-out generations on standard QA benchmarks, or if selective intervention fails to raise truthfulness scores relative to baselines, the detection and gating claim would be refuted.
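The separation half of this test amounts to computing AUROC between NLL scores on labeled factual and hallucinated states. A hedged sketch (the `auroc` helper is hypothetical, not the paper's evaluation code), using the rank-based Mann-Whitney formulation:

```python
def auroc(anomaly_scores, labels):
    """AUROC via the Mann-Whitney formulation: the probability that a
    randomly chosen hallucinated state (label 1) receives a higher
    anomaly score (NLL) than a randomly chosen factual state (label 0).
    Ties count half."""
    pos = [s for s, y in zip(anomaly_scores, labels) if y == 1]
    neg = [s for s, y in zip(anomaly_scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

An AUROC near 0.5 on held-out generations would indicate no reliable separation and refute the detection claim; values near 1.0 are what the paper reports.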

Figures

Figures reproduced from arXiv: 2605.05953 by Elia Cunegatti, Erik Nielsen, Giovanni Iacca, Marcus Vukojevic.

Figure 1
Figure 1. PCNET detects hallucinated hidden states via exact NLL and PC-LDCD corrects them in the discrete token space, leaving factual generations untouched. The example shown is based on PC-LDCD applied to Qwen3-4B.
Figure 2
Figure 2. Architecture of the proposed framework. Phase 1 (top) projects h_last through a Multi-Layer Perceptron (MLP) bottleneck into PCNET for exact NLL computation. Phase 2 (bottom) gates on NLL ≥ τ: anomalous states are detected, and the next token is selected via density-penalized lookahead, while factual states proceed through standard decoding. The example shown is based on PC-LDCD applied to Qwen3-4B.
Figure 3
Figure 3. Illustration of the PCNET density model. (a) Factual hidden-state projections concentrate in high-density regions of the learned manifold; hallucinated projections fall into low-density outlier regions where SNLL is elevated. (b) Per-token NLL trajectory: factual generation remains stable while a hallucination triggers a sharp spike that crosses the detection threshold.
Figure 4
Figure 4. (a) Corruption (Red) and Preservation (Green) rates across all models and methods, averaged across LLMs with standard deviation error bars. Corruption measures the fraction of correct generations degraded by un-gated intervention; Preservation measures those protected by PCNET gating. (b) Utility-truthfulness trade-off across the four tested LLMs. Un-gated interventions (Red) trigger semantic collapse.
Figure 5
Figure 5. Results of the additional benchmark and ablations. (a)-(d) TruthfulQA MC1/MC2/MC3 and TriviaQA EM for vanilla, un-gated RAG, gated RAG, and PC-LDCD (bars indicate mean over Llama-3.2-1B and Mistral-7B; error bars indicate the corresponding std). (e), (g): PCNET detection AUROC on CoQA and TruthfulQA as a function of training-set size; (f), (h): as a function of projection dimension.
Figure 6
Figure 6. AUROC detection achieved by PCNET on the Llama-3.2-1B and Mistral-7B LLMs and on the CoQA and TruthfulQA benchmark settings across different training dataset sizes. (a) The line represents the average across LLMs and datasets, and the shadow represents the standard deviation. (b) Each line refers to a single execution.
Figure 7
Figure 7. MLP projection dimensionality ablation across…
original abstract

One of the most critical challenges in Large Language Models is their tendency to hallucinate, i.e., produce factually incorrect responses. Existing approaches show promising results in terms of hallucination correction, but still suffer from a main limitation: they apply corrections indiscriminately to every token, corrupting also the originally correct generations. To overcome this drawback, we propose PCNET, a Probabilistic Circuit trained as a tractable density estimator over the LLM residual stream. The method detects hallucinations as geometric anomalies on the factual manifold, which is done via exact Negative Log-Likelihood computation, hence without the need for sampling, external verifiers, or weight modifications, as in existing techniques. To demonstrate its effectiveness, we exploit PCNET as a dynamic gate that distinguishes hallucinated from factual hidden states at each decoding step. This triggers our second main contribution, PC-LDCD (Probabilistic Circuit Latent Density Contrastive Decoding), only when the latent geometry deviates from factual regions, while leaving correct generations untouched. Across four LLMs, ranging from 1B to 8B models, and four benchmarks covering conversational reasoning, knowledge-intensive QA, reading comprehension, and truthfulness, PCNET achieves near-perfect hallucination detection across CoQA, SQuAD v2.0, and TriviaQA, with AUROC reaching up to 99%. Moreover, PC-LDCD obtains the highest True+Info, MC2, and MC3 scores on TruthfulQA in three out of four models, in comparison with state-of-the-art baselines, while reducing the mean corruption rate to 53.7% and achieving a preservation rate of 79.3%. Our proposed method is publicly available on GitHub.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 1 minor

Summary. The paper proposes PCNET, a probabilistic circuit trained as a tractable density estimator over LLM residual streams, to detect hallucinations as low-density geometric anomalies on a factual manifold using exact negative log-likelihood without sampling or external verifiers. It introduces PC-LDCD to apply selective latent density contrastive decoding only on detected anomalies, leaving correct generations untouched. Experiments across four LLMs (1B-8B) and benchmarks (CoQA, SQuAD v2.0, TriviaQA, TruthfulQA) report AUROC up to 99% for detection and superior True+Info/MC2/MC3 scores with mean corruption rate reduced to 53.7% and preservation rate 79.3%.

Significance. If validated, the approach would be significant for enabling dynamic, targeted hallucination mitigation in LLMs that avoids indiscriminate corruption of correct outputs. The use of exact-inference probabilistic circuits for high-dimensional density estimation on residual streams offers a computationally efficient alternative to sampling-based or verifier-dependent methods, with potential for broader anomaly detection in model internals if the manifold hypothesis generalizes.

major comments (3)
  1. [Abstract] The training details for PCNET—including residual stream collection (layers, prompts, models), circuit architecture (depth, width, structure learning method), and validation splits or overfitting controls—are entirely absent. These are load-bearing for the central claim that NLL under the fitted PC reliably separates factual from hallucinated states, as the reported AUROC up to 99% cannot be assessed without them.
  2. [Abstract] No ablations on PC hyperparameters or comparisons to simpler density estimators (e.g., Gaussian or KDE on identical residual-stream features) are provided. This undermines the claim that the PC-based anomaly detection is necessary or superior, as the performance could stem from the residual-stream features themselves rather than the tractable PC structure.
  3. [Abstract] The anomaly score is defined as the negative log-likelihood of a PC whose parameters are fitted directly to the residual-stream distribution; this creates a circularity risk where detection reduces to a model-internal quantity rather than an independent external benchmark, potentially limiting generalization beyond the evaluated QA benchmarks.
minor comments (1)
  1. [Abstract] The abstract states the code is publicly available on GitHub but provides neither the repository URL nor any reproducibility artifacts (e.g., exact hyperparameters or data collection scripts).
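The baseline proposed in major comment 2 is straightforward to set up: fit a simpler density estimator on the identical residual-stream features and compare its NLL-based detection AUROC with PCNET's. A sketch of a Gaussian-kernel KDE scorer (hypothetical code, assuming features are plain Python vectors; not from the paper):

```python
import math

def kde_nll(x, train, bandwidth=1.0):
    """NLL of x under a Gaussian-kernel KDE fitted to the same features.
    A simple baseline of the kind the referee asks to compare PCNET against."""
    d = len(x)
    log_norm = -0.5 * d * math.log(2 * math.pi * bandwidth ** 2)
    logs = [log_norm
            - 0.5 * sum((a - b) ** 2 for a, b in zip(x, t)) / bandwidth ** 2
            for t in train]
    m = max(logs)  # log-sum-exp for numerical stability
    return -(m + math.log(sum(math.exp(l - m) for l in logs))
             - math.log(len(train)))
```

Scoring held-out states with this and with PCNET on identical features, then comparing AUROC, would isolate the contribution of the circuit structure from that of the features.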

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive comments, which highlight important areas for improving the clarity and completeness of our work. We address each major comment point by point below and indicate the revisions planned for the next version of the manuscript.

point-by-point responses
  1. Referee: [Abstract] The training details for PCNET—including residual stream collection (layers, prompts, models), circuit architecture (depth, width, structure learning method), and validation splits or overfitting controls—are entirely absent. These are load-bearing for the central claim that NLL under the fitted PC reliably separates factual from hallucinated states, as the reported AUROC up to 99% cannot be assessed without them.

    Authors: We agree that the abstract does not contain these training details and that their absence makes it difficult to fully evaluate the reported AUROC results. Although the main body of the manuscript outlines the overall training approach, we will revise the abstract to include a concise summary of the residual stream collection process, PC architecture choices, and validation procedures. We will also expand the methods section to provide complete specifications for layers, prompts, models, circuit depth and width, structure learning method, data splits, and overfitting controls. This will allow readers to properly assess the central claims. revision: yes

  2. Referee: [Abstract] No ablations on PC hyperparameters or comparisons to simpler density estimators (e.g., Gaussian or KDE on identical residual-stream features) are provided. This undermines the claim that the PC-based anomaly detection is necessary or superior, as the performance could stem from the residual-stream features themselves rather than the tractable PC structure.

    Authors: This observation is correct, and the current manuscript does not include such ablations or baseline comparisons. We will add these experiments in the revised version, including direct comparisons of the probabilistic circuit against Gaussian mixture models and kernel density estimation using the identical residual-stream features, as well as sensitivity analyses for key hyperparameters such as circuit depth and width. These additions will help isolate the contribution of the PC structure to the observed performance. revision: yes

  3. Referee: [Abstract] The anomaly score is defined as the negative log-likelihood of a PC whose parameters are fitted directly to the residual-stream distribution; this creates a circularity risk where detection reduces to a model-internal quantity rather than an independent external benchmark, potentially limiting generalization beyond the evaluated QA benchmarks.

    Authors: We appreciate this concern regarding potential circularity. The PC is trained exclusively on residual streams from factual generations using prompts and data disjoint from the evaluation sets, thereby modeling an external factual manifold. The NLL is then applied at inference time to new residual states to detect deviations. This separation ensures the anomaly score is not derived from the same generation process being evaluated. We will add an explicit discussion of this training-inference separation and its implications for generalization in the methods section of the revised manuscript. revision: partial

Circularity Check

0 steps flagged

No significant circularity: empirical validation on external benchmarks

full rationale

The paper defines PCNET as a density estimator trained on LLM residual-stream activations and uses exact NLL as an anomaly score to flag hallucinations. Detection performance (AUROC up to 99%) and intervention results are measured against ground-truth labels from independent QA benchmarks (CoQA, SQuAD v2.0, TriviaQA, TruthfulQA). No equations or claims reduce the reported metrics to the fitted parameters by construction, no load-bearing self-citations appear, and no ansatz or uniqueness result is smuggled in. The central claims rest on external empirical evaluation rather than tautological redefinition of inputs.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 2 invented entities

The central claim rests on the learnability of a factual density in the residual stream and the separability of hallucinated states as outliers under that density; both are domain assumptions rather than derived results.

free parameters (1)
  • Probabilistic circuit parameters
    Parameters of PCNET are fitted to residual-stream activations from factual generations to serve as the density estimator.
axioms (1)
  • domain assumption · The residual stream of LLMs contains a factual manifold whose density can be tractably modeled by a probabilistic circuit.
    Invoked to justify treating high NLL as evidence of hallucination.
invented entities (2)
  • PCNET · no independent evidence
    purpose: Tractable density estimator over LLM residual stream for hallucination detection
    New model introduced by the paper.
  • PC-LDCD · no independent evidence
    purpose: Conditional latent density contrastive decoding triggered only on detected anomalies
    New decoding procedure introduced by the paper.

pith-pipeline@v0.9.0 · 5614 in / 1336 out tokens · 67578 ms · 2026-05-08T10:47:49.973450+00:00 · methodology



    Stephen Robertson and Hugo Zaragoza.The probabilistic relevance framework: BM25 and beyond, volume 4. Now Publishers Inc, 2009. 12 A Theoretical foundations We provide the theoretical grounding forPCNETandPC-LDCD. We refer the reader to [ 29] and [7] for foundational treatments of probabilistic circuits, including formal treatments of smoothness, decompos...