Pith Number

pith:6YETJUN5

pith:2023:6YETJUN5ELIXCCUFJKB6DORLLZ

not attested · not anchored · not stored · refs resolved

Aligning Large Multimodal Models with Factually Augmented RLHF

Chuang Gan, Chunyuan Li, Haotian Liu, Kurt Keutzer, Liang-Yan Gui, Shengcao Cao, Sheng Shen, Trevor Darrell, Yikang Shen, Yiming Yang, Yu-Xiong Wang, Zhiqing Sun

Factually augmented RLHF aligns large multimodal models to cut hallucinations and reach 94 percent of the text-only GPT-4 performance level on LLaVA-Bench.

arxiv:2309.14525 v1 · 2023-09-25 · cs.CV · cs.CL

Add to your LaTeX paper

\usepackage{pith}
\pithnumber{6YETJUN5ELIXCCUFJKB6DORLLZ}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp

2 Internet Archive

3 Author claim open

4 Citations open

5 Replications open

✓ Portable graph bundle live

The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

As the first LMM trained with RLHF, our approach achieves remarkable improvement on the LLaVA-Bench dataset with the 94% performance level of the text-only GPT-4 (while previous best methods can only achieve the 87% level), and an improvement by 60% on MMHAL-BENCH over other baselines.

C2 · weakest assumption

That augmenting the reward model with image captions and ground-truth options reliably prevents reward hacking without introducing new biases or reducing generalization on open-ended questions.

C3 · one-line summary

Factually Augmented RLHF aligns large multimodal models to reduce hallucinations, reaching 94% of the text-only GPT-4 performance level on LLaVA-Bench and a 60% improvement over other baselines on the new MMHAL-BENCH.

References

40 extracted · 40 resolved · 27 Pith anchors

[1] PaLM 2 Technical Report · arXiv:2305.10403

[2] OpenFlamingo: An Open-Source Framework for Training Large Autoregressive Vision-Language Models · arXiv:2308.01390

[3] Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond · arXiv:2308.12966

[4] Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback · arXiv:2204.05862

[5] Language models are few-shot learners 1901

Formal links

2 machine-checked theorem links

Cited by

39 papers in Pith

DepthAgent: Towards Better Universal Depth Estimation via Sample-wise Expert Selection

ErrorRadar: Benchmarking Complex Mathematical Reasoning of Multimodal Large Language Models Via Error Detection

A Survey on LLM-as-a-Judge

Small Generalizable Prompt Predictive Models Can Steer Efficient RL Post-Training of Large Reasoning Models

Deeper Thought, Weaker Aim: Understanding and Mitigating Perceptual Impairment during Reasoning in Multimodal Large Language Models

Receipt and verification

First computed	2026-05-17T23:38:50.660329Z
Builder	pith-number-builder-2026-05-17-v1
Signature	Pith Ed25519 (`pith-v1-2026-05`) · public key
Schema	pith-number/v1.0

Canonical hash

f60934d1bd22d1710a854a83e1ba2b5e6737dc1ce190bd9a4bcce587715752dd

Aliases

arxiv: 2309.14525 · arxiv_version: 2309.14525v1 · doi: 10.48550/arxiv.2309.14525 · pith_short_12: 6YETJUN5ELIX · pith_short_16: 6YETJUN5ELIXCCUF · pith_short_8: 6YETJUN5
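
In this record, each short alias is a leading prefix of the full 26-character identifier. The sketch below reproduces the three short forms listed above under that assumption. The function name `short_aliases` is hypothetical, and the prefix rule is inferred from this one record rather than taken from the Pith schema.

```python
# Full identifier as printed in this record.
FULL_ID = "6YETJUN5ELIXCCUFJKB6DORLLZ"


def short_aliases(full_id: str, lengths=(8, 12, 16)) -> dict:
    """Return short alias names mapped to prefixes of the full identifier.

    Assumption: short forms are plain leading prefixes, as observed in
    this record's alias list. This is not a documented schema rule.
    """
    return {f"pith_short_{n}": full_id[:n] for n in lengths}


aliases = short_aliases(FULL_ID)
for name, value in sorted(aliases.items()):
    print(f"{name}: {value}")
```

A shorter prefix is easier to quote but more likely to collide with another record, which may be why several lengths are listed.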

Verify this Pith Number yourself

curl -sH 'Accept: application/ld+json' https://pith.science/pith/6YETJUN5ELIXCCUFJKB6DORLLZ \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: f60934d1bd22d1710a854a83e1ba2b5e6737dc1ce190bd9a4bcce587715752dd

Canonical record JSON

{
  "metadata": {
    "abstract_canon_sha256": "4a31ccd234c5dd37a44709d78049d1b291502e92a679cc0c02c73eb12bf35fdf",
    "cross_cats_sorted": [
      "cs.CL"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.CV",
    "submitted_at": "2023-09-25T20:59:33Z",
    "title_canon_sha256": "b87d6710b9a70b3477c1ded6d7d8d8fa6c0ab18f6bcb0fae4771d23b5540c209"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2309.14525",
    "kind": "arxiv",
    "version": 1
  }
}
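
The verification one-liner above can also be run offline against the record as displayed here. This minimal sketch applies the same canonicalization (sorted keys, compact separators, `ensure_ascii=False`, UTF-8) and compares the digest with the canonical hash shown on this page. The names `RECORD_JSON`, `EXPECTED`, and `canonical_sha256` are illustrative only. A match also depends on the displayed JSON being identical to the served `canonical_record`, which is assumed here, not confirmed.

```python
import hashlib
import json

# Record copied from the "Canonical record JSON" section of this page.
RECORD_JSON = """
{
  "metadata": {
    "abstract_canon_sha256": "4a31ccd234c5dd37a44709d78049d1b291502e92a679cc0c02c73eb12bf35fdf",
    "cross_cats_sorted": ["cs.CL"],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.CV",
    "submitted_at": "2023-09-25T20:59:33Z",
    "title_canon_sha256": "b87d6710b9a70b3477c1ded6d7d8d8fa6c0ab18f6bcb0fae4771d23b5540c209"
  },
  "schema_version": "1.0",
  "source": {"id": "2309.14525", "kind": "arxiv", "version": 1}
}
"""

# Canonical hash as printed on this page.
EXPECTED = "f60934d1bd22d1710a854a83e1ba2b5e6737dc1ce190bd9a4bcce587715752dd"


def canonical_sha256(record: dict) -> str:
    """Hash a record using the same serialization as the shell one-liner."""
    payload = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


record = json.loads(RECORD_JSON)
digest = canonical_sha256(record)
print(digest)
print("match" if digest == EXPECTED else "mismatch")
```

Because keys are sorted and whitespace is removed before hashing, the digest does not depend on how the JSON is indented or on the order in which keys appear.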