A Survey of Hallucination in Large Foundation Models
Pith reviewed 2026-05-16 15:16 UTC · model grok-4.3
The pith
Hallucination in large foundation models falls into specific types that support targeted evaluation criteria and mitigation strategies.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper establishes that hallucination in large foundation models can be systematically classified into distinct phenomena, with corresponding evaluation criteria and mitigation strategies reviewed across the literature, providing a structured overview that clarifies challenges and points toward future solutions.
What carries the argument
The classification of hallucination types specific to large foundation models, which organizes phenomena to enable evaluation and mitigation.
If this is right
- Evaluation benchmarks can be built around the identified hallucination types for more precise measurement (a minimal harness along these lines is sketched after this list).
- Mitigation techniques can be developed or refined to target specific categories of hallucination.
- Future model training and alignment efforts gain structured guidance from the reviewed strategies.
- Deployment decisions for large foundation models can incorporate type-based risk assessments.
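A minimal sketch of what a type-based evaluation harness could look like, assuming a hypothetical classifier and illustrative category names (these are not the survey's own labels):

```python
from collections import Counter

# Hypothetical hallucination categories; the survey's own taxonomy
# would be substituted here. Names are illustrative only.
CATEGORIES = ["factual_contradiction", "fabricated_entity", "input_inconsistency"]

def evaluate_by_type(outputs, classify):
    """Per-category hallucination rate over a list of model outputs.

    `classify` maps an output to a category name, or to None when the
    output is judged hallucination-free.
    """
    if not outputs:
        return {}
    counts = Counter(classify(o) for o in outputs)
    return {c: counts.get(c, 0) / len(outputs) for c in CATEGORIES}

# Usage with a trivial stand-in classifier:
def toy_classifier(output):
    return "fabricated_entity" if "[citation needed]" in output else None

print(evaluate_by_type(["fine", "see [citation needed]"], toy_classifier))
# {'factual_contradiction': 0.0, 'fabricated_entity': 0.5, 'input_inconsistency': 0.0}
```

Reporting a rate per category, rather than one aggregate score, is what makes the measurement "more precise" in the sense above: mitigation work can then target whichever category dominates.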
Where Pith is reading between the lines
- The classification framework may extend to new model architectures by testing whether emerging behaviors fit existing categories.
- Connections to broader AI safety could arise if hallucination types are linked to specific failure modes in real-world use.
- Empirical studies could validate the survey by applying the criteria to outputs from multiple foundation models and measuring coverage, as sketched below.
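One way such a validation could be operationalized: have annotators label hallucinated outputs from several models with a survey category (or none), then report the fraction that fit, i.e. the taxonomy's empirical coverage. A hedged sketch, with all model names and labels illustrative:

```python
def taxonomy_coverage(annotations, categories):
    """Fraction of annotated hallucination instances, per model, that
    fit any defined category; None marks 'fits no category'."""
    return {
        model: sum(1 for label in labels if label in categories) / len(labels)
        for model, labels in annotations.items()
        if labels  # skip models with no annotated instances
    }

cats = {"factual_contradiction", "fabricated_entity", "input_inconsistency"}
annotations = {  # illustrative annotator labels for two hypothetical models
    "model_a": ["fabricated_entity", "factual_contradiction", None],
    "model_b": ["input_inconsistency", None, None],
}
print(taxonomy_coverage(annotations, cats))  # roughly {'model_a': 0.67, 'model_b': 0.33}
```

A persistent unclassifiable residue across models would be exactly the evidence described under "What would settle it" below.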
Load-bearing premise
The reviewed literature is representative and the proposed classification captures the full range of hallucination phenomena in large foundation models.
What would settle it
Discovery of a consistent hallucination behavior in a deployed large foundation model that cannot be placed into any of the survey's defined categories would undermine the classification's completeness.
Original abstract
Hallucination in a foundation model (FM) refers to the generation of content that strays from factual reality or includes fabricated information. This survey paper provides an extensive overview of recent efforts that aim to identify, elucidate, and tackle the problem of hallucination, with a particular focus on "Large" Foundation Models (LFMs). The paper classifies various types of hallucination phenomena that are specific to LFMs and establishes evaluation criteria for assessing the extent of hallucination. It also examines existing strategies for mitigating hallucination in LFMs and discusses potential directions for future research in this area. Essentially, the paper offers a comprehensive examination of the challenges and solutions related to hallucination in LFMs.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper is a survey on hallucination in Large Foundation Models (LFMs). It defines hallucination as generation of content straying from factual reality, classifies hallucination phenomena specific to LFMs, establishes evaluation criteria, examines mitigation strategies, and discusses future research directions, claiming to provide a comprehensive examination of challenges and solutions.
Significance. If the taxonomy and literature coverage hold, the survey would organize a rapidly growing area, helping researchers navigate types of hallucinations, benchmarks, and mitigation techniques in LFMs. It could serve as a reference point for standardizing evaluation and highlighting open problems, provided the selection of works is representative.
Major comments (1)
- [Abstract / Literature Review Approach] The manuscript provides no explicit search protocol, databases searched, date range, keywords, or inclusion/exclusion criteria (see Abstract and the opening of the literature review section). This is load-bearing for the central claim of a 'comprehensive examination,' as it leaves open the possibility that recent work on multimodal, vision-language, or agentic models is underrepresented and that the proposed taxonomy misses boundary phenomena.
Minor comments (2)
- [Taxonomy section] Clarify whether the taxonomy is intended to be exhaustive or illustrative; add a short discussion of how new hallucination types (e.g., sycophancy induced by RLHF) would be accommodated (one possible extensible scheme is sketched after these comments).
- [Mitigation Strategies] Ensure figure captions and tables listing mitigation methods include the publication year of each cited work to aid readers in tracking recency.
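If the taxonomy is meant to be extensible rather than exhaustive, one concrete way to make that explicit is a category registry with an unclassified fallback, so a new phenomenon such as RLHF-induced sycophancy can be added without reworking the scheme. A hypothetical sketch (all names illustrative, not the survey's):

```python
# Hypothetical extensible taxonomy registry. New hallucination types
# register themselves instead of forcing a rewrite of the scheme.
TAXONOMY = {}

def register(name, description):
    TAXONOMY[name] = description

register("factual_contradiction", "output contradicts established facts")
register("fabricated_entity", "output invents sources, citations, or objects")
register("input_inconsistency", "output conflicts with the given input")

# Accommodating a new type later, as the comment above suggests:
register("sycophancy", "output agrees with the user against the evidence")

def classify(label):
    # Unknown labels fall into an explicit bucket rather than being
    # silently forced into an existing category.
    return label if label in TAXONOMY else "unclassified"

print(classify("sycophancy"))      # 'sycophancy'
print(classify("novel_behavior"))  # 'unclassified'
```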
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. We address the single major comment below and will revise the manuscript to strengthen the literature review methodology.
Point-by-point responses
Referee: The manuscript provides no explicit search protocol, databases searched, date range, keywords, or inclusion/exclusion criteria (see Abstract and the opening of the literature review section). This is load-bearing for the central claim of a 'comprehensive examination,' as it leaves open the possibility that recent work on multimodal, vision-language, or agentic models is underrepresented and that the proposed taxonomy misses boundary phenomena.
Authors: We agree that an explicit search protocol is necessary to support the claim of comprehensiveness. In the revised manuscript we will insert a new subsection 'Literature Search and Selection Methodology' immediately after the abstract. It will document: databases (arXiv, Google Scholar, ACL Anthology, NeurIPS/ICLR/CVPR proceedings), date range (January 2018–September 2023), keyword combinations (e.g., 'hallucination' AND ('large language model' OR 'foundation model' OR 'vision-language model')), and inclusion criteria (empirical studies on models with ≥1B parameters that directly measure or mitigate hallucination). We will also perform an additional targeted search for recent multimodal and agentic work and add relevant citations to ensure boundary cases are covered; the taxonomy in Section 3 is intentionally extensible and already references vision-language phenomena, but we will make this coverage explicit.
Revision: yes
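A mechanical reading of the protocol the authors describe, as a sketch: the record schema and field names below are illustrative, but the keyword query, date range, and ≥1B-parameter criterion are taken from the response above.

```python
from datetime import date

# Keyword query from the response: 'hallucination' AND ('large language
# model' OR 'foundation model' OR 'vision-language model')
MODEL_TERMS = ("large language model", "foundation model", "vision-language model")
DATE_RANGE = (date(2018, 1, 1), date(2023, 9, 30))

def include(record):
    """Apply the stated inclusion criteria to one candidate record.
    `record` is a dict with 'title', 'abstract', 'published', and
    'params_billions' fields (schema illustrative)."""
    text = (record["title"] + " " + record["abstract"]).lower()
    keyword_hit = "hallucination" in text and any(t in text for t in MODEL_TERMS)
    in_range = DATE_RANGE[0] <= record["published"] <= DATE_RANGE[1]
    return keyword_hit and in_range and record["params_billions"] >= 1

candidate = {
    "title": "Hallucination in a Large Language Model",
    "abstract": "We study factual errors in generated text.",
    "published": date(2023, 5, 1),
    "params_billions": 7,
}
print(include(candidate))  # True
```

Documenting the protocol this concretely also makes the 'comprehensive examination' claim auditable: anyone can rerun the query and check what the selection missed.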
Circularity Check
No circularity: survey synthesis with no derivations or self-referential reductions
Full rationale
The paper is a literature survey that classifies hallucination types in LFMs, reviews evaluation criteria and mitigation strategies, and outlines future directions. No equations, fitted parameters, predictions, or uniqueness theorems appear. Central claims rest on synthesis of external literature rather than any step that reduces by construction to the paper's own inputs or self-citations. Self-citations, if present, are not load-bearing for any derived result.
Axiom & Free-Parameter Ledger
Empty: as noted in the circularity rationale, the survey introduces no equations, fitted parameters, or axioms to track.
Forward citations
Cited by 17 Pith papers
- 3D-VCD: Hallucination Mitigation in 3D-LLM Embodied Agents through Visual Contrastive Decoding. 3D-VCD reduces hallucinations in 3D-LLM embodied agents by contrasting predictions from original and distorted 3D scene representations at inference time.
- Evaluating Patient Safety Risks in Generative AI: Development and Validation of a FMECA Framework for Generated Clinical Content. A novel FMECA-based framework was developed and validated for systematic assessment of patient safety risks in LLM-generated clinical discharge summaries, demonstrating moderate-to-substantial inter-rater agreement an...
- GraphScout: Empowering Large Language Models with Intrinsic Exploration Ability for Agentic Graph Reasoning. GraphScout trains LLMs to autonomously synthesize structured training data from knowledge graphs via flexible exploration tools, enabling a 4B model to outperform larger LLMs by 16.7% on average with fewer inference t...
- Dimension-Level Intent Fidelity Evaluation for Large Language Models: Evidence from Structured Prompt Ablation. Dimension-level evaluation reveals that 25-58% of LLM outputs with perfect holistic scores still show measurable intent deficits across languages and domains.
- Estimating the Black-box LLM Uncertainty with Distribution-Aligned Adversarial Distillation. DisAAD trains a 1%-sized proxy model via adversarial distillation to quantify uncertainty in black-box LLMs by aligning with their output distributions.
- Chain of Evidence: Pixel-Level Visual Attribution for Iterative Retrieval-Augmented Generation. CoE applies vision-language models directly to document screenshots to deliver pixel-level bounding-box attribution for evidence in iterative retrieval-augmented generation, outperforming text baselines on visual-layo...
- Online Self-Calibration Against Hallucination in Vision-Language Models. OSCAR exploits the generative-discriminative gap in LVLMs to build online preference data with MCTS and dual-granularity rewards for DPO-based calibration, claiming SOTA hallucination reduction and improved multimodal...
- Mind the Unseen Mass: Unmasking LLM Hallucinations via Soft-Hybrid Alphabet Estimation. SHADE adaptively combines coverage and spectral signals to estimate semantic alphabet size from few LLM samples, yielding better performance than baselines in low-sample regimes for alphabet estimation and QA error detection.
- SinkTrack: Attention Sink based Context Anchoring for Large Language Models. SinkTrack uses attention sink at the BOS token to anchor LLMs to initial context, reducing hallucination and forgetting with reported gains on benchmarks like SQuAD2.0 and M3CoT.
- Do Hallucination Neurons Generalize? Evidence from Cross-Domain Transfer in LLMs. Hallucination neurons in LLMs are domain-specific, with cross-domain classifiers dropping from AUROC 0.783 within-domain to 0.563 across domains.
- The Readability Spectrum: Patterns, Issues, and Prompt Effects in LLM-Generated Code. LLM-generated code matches human-written code in overall readability but exhibits different issue patterns, and prompt engineering has limited impact on improving it.
- Beyond Accuracy: LLM Variability in Evidence Screening for Software Engineering SLRs. LLMs exhibit substantial heterogeneity and non-determinism in SLR evidence screening, abstracts are decisive for performance, and they show no reliable superiority over classical classifiers on two real SLRs.
- A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions. The paper surveys hallucination in LLMs with an innovative taxonomy, factors, detection methods, benchmarks, mitigation strategies, and open research directions.
- Trust but Verify: Introducing DAVinCI -- A Framework for Dual Attribution and Verification in Claim Inference for Language Models. DAVinCI combines claim attribution to model internals and external sources with entailment-based verification to improve LLM factual reliability by 5-20% on fact-checking datasets.
- A Systematic Study of Retrieval Pipeline Design for Retrieval-Augmented Medical Question Answering. Dense retrieval plus query reformulation and reranking reaches 60.49% accuracy on MedQA USMLE, outperforming other setups while domain-specialized models make better use of the retrieved evidence.
- A Comprehensive Survey of Hallucination Mitigation Techniques in Large Language Models. A survey that compiles and taxonomizes more than 32 existing hallucination mitigation techniques for LLMs while analyzing their challenges and limitations.
- A Survey on the Memory Mechanism of Large Language Model based Agents. A systematic review of memory designs, evaluation methods, applications, limitations, and future directions for LLM-based agents.