Pith Number

pith:LC2EMAER

pith:2024:LC2EMAER6QLZIBSPQPF7M7YUSU

not attested · not anchored · not stored · refs resolved

LogicVista: Multimodal LLM Logical Reasoning Benchmark in Visual Contexts

Edward Sun, Tianyu Liu, Wei Wang, Yijia Xiao

LogicVista provides a benchmark of 448 visual questions to evaluate logical reasoning in multimodal LLMs across five tasks and nine capabilities.

arxiv:2407.04973 v1 · 2024-07-06 · cs.AI · cs.CL · cs.CV · cs.LG


Add to your LaTeX paper

\usepackage{pith}
\pithnumber{LC2EMAER6QLZIBSPQPF7M7YUSU}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp

2 Internet Archive

3 Author claim open

4 Citations open

5 Replications open

✓ Portable graph bundle live

The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
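As a rough illustration of why a deterministic merge lets independent mirrors agree, the sketch below folds a set of events into a state after sorting them by a fixed key. The event fields (`id`, `ts`, `field`, `value`), the sample values, and the last-writer-wins rule are all assumptions made for this example; this page does not describe Pith's actual merge algorithm or event schema.

```python
from typing import Any

# Hypothetical event shape: {"id": str, "ts": str, "field": str, "value": Any}.
# This is NOT Pith's documented schema; it only illustrates order-independence.

def merge_events(events: list[dict[str, Any]]) -> dict[str, Any]:
    """Fold events into a state dict in a fixed, input-order-independent way."""
    state: dict[str, Any] = {}
    # Sorting by (timestamp, id) gives every mirror the same replay order,
    # regardless of the order in which it received the events.
    for ev in sorted(events, key=lambda e: (e["ts"], e["id"])):
        state[ev["field"]] = ev["value"]  # assumed rule: last writer wins
    return state

# Made-up sample events for the illustration.
events = [
    {"id": "e2", "ts": "2026-01-02T00:00:00Z", "field": "status", "value": "live"},
    {"id": "e1", "ts": "2026-01-01T00:00:00Z", "field": "status", "value": "draft"},
]

# Two mirrors that received the events in different orders reach the same state.
state_a = merge_events(events)
state_b = merge_events(list(reversed(events)))
print(state_a == state_b)
```

The design point is only that the replay order depends on the events themselves and not on arrival order, which is what allows any host to recompute the same state from the same bundle.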

Claims

C1 strongest claim

LogicVista assesses the integrated logical reasoning capabilities of MLLMs in visual contexts across 5 logical reasoning tasks encompassing 9 different capabilities using a sample of 448 multiple-choice questions.

C2 weakest assumption

The 448 questions and their human-written reasoning annotations accurately and comprehensively capture general logical cognition abilities in visual contexts without significant selection bias or coverage gaps.

C3 one line summary

LogicVista is a new benchmark dataset with 448 visual logic questions that evaluates multimodal LLMs on five reasoning tasks covering nine capabilities.

References

58 extracted · 58 resolved · 0 Pith anchors

[1] OpenAI, Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Florencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, Red Avila, Igor Babuschkin, Suchir 2024

[2] Flamingo: a visual language model for few-shot learning, 2022

[4] MiniGPT-4: Enhancing vision-language understanding with advanced large language models, 2023

[5] A survey on multimodal large language models, 2023

[6] MME: A comprehensive evaluation benchmark for multimodal large language models, 2023

Formal links

2 machine-checked theorem links

Cited by

36 papers in Pith

Visually-Guided Policy Optimization for Multimodal Reasoning

Advancing AI Research Assistants with Expert-Involved Learning

VRPRM: Process Reward Modeling via Visual Reasoning

MapTab: Are MLLMs Ready for Multi-Criteria Route Planning in Heterogeneous Graphs?

Reversing the Flow: Generation-to-Understanding Synergy in Large Multimodal Models

Receipt and verification

First computed	2026-05-17T23:38:49.330835Z
Builder	pith-number-builder-2026-05-17-v1
Signature	Pith Ed25519 (`pith-v1-2026-05`) · public key
Schema	pith-number/v1.0

Canonical hash

58b4460091f41794064f83cbf67f14950cd0c7e06fadcd306614670051539af6
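On this record, the Pith Number appears to coincide with the unpadded base32 (RFC 4648) encoding of the first 16 bytes of the canonical hash. The sketch below checks that relationship for the two values shown on this page. It is an observation about this one record, not a documented derivation rule, and the function name `base32_prefix` is made up for the example.

```python
import base64

# Both values are copied from this page.
CANONICAL_HASH = "58b4460091f41794064f83cbf67f14950cd0c7e06fadcd306614670051539af6"
PITH_NUMBER = "LC2EMAER6QLZIBSPQPF7M7YUSU"

def base32_prefix(hex_digest: str, n_bytes: int = 16) -> str:
    """Base32-encode the first n_bytes of a hex digest, without '=' padding."""
    raw = bytes.fromhex(hex_digest)[:n_bytes]
    return base64.b32encode(raw).decode("ascii").rstrip("=")

derived = base32_prefix(CANONICAL_HASH)
print(derived)
print(derived == PITH_NUMBER)
```

Sixteen bytes are 128 bits, which base32 renders as 26 characters once padding is stripped, matching the length of the identifier.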

Aliases

arxiv: 2407.04973 · arxiv_version: 2407.04973v1 · doi: 10.48550/arxiv.2407.04973 · pith_short_12: LC2EMAER6QLZ · pith_short_16: LC2EMAER6QLZIBSP · pith_short_8: LC2EMAER
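The alias values listed here follow a visible pattern: the short forms are prefixes of the full Pith Number, and the arXiv-based forms are built from the arXiv ID and version. A minimal sketch of that pattern follows; the function `derive_aliases` is hypothetical and simply reproduces the strings shown on this page, rather than any published Pith API.

```python
def derive_aliases(pith_number: str, arxiv_id: str, version: int) -> dict[str, str]:
    """Rebuild the alias strings from the full identifier and arXiv metadata."""
    return {
        "arxiv": arxiv_id,
        "arxiv_version": f"{arxiv_id}v{version}",
        "doi": f"10.48550/arxiv.{arxiv_id}",
        # Short forms are plain prefixes of the full 26-character identifier.
        "pith_short_8": pith_number[:8],
        "pith_short_12": pith_number[:12],
        "pith_short_16": pith_number[:16],
    }

aliases = derive_aliases("LC2EMAER6QLZIBSPQPF7M7YUSU", "2407.04973", 1)
for name, value in aliases.items():
    print(f"{name}: {value}")
```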

Agent API

Resolver JSON · Graph JSON · Events JSON · Schema · Signing key

Verify this Pith Number yourself

curl -sH 'Accept: application/ld+json' https://pith.science/pith/LC2EMAER6QLZIBSPQPF7M7YUSU \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 58b4460091f41794064f83cbf67f14950cd0c7e06fadcd306614670051539af6

Canonical record JSON

{
  "metadata": {
    "abstract_canon_sha256": "0ab50ae4f366a54860372ccf4025a2176a88d4c211d7be483104ed2a2994079d",
    "cross_cats_sorted": [
      "cs.CL",
      "cs.CV",
      "cs.LG"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.AI",
    "submitted_at": "2024-07-06T06:48:16Z",
    "title_canon_sha256": "ea00387c1c41a9eac9493f08e683880dd321e728c00c721a755ec6efd2b5e40c"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2407.04973",
    "kind": "arxiv",
    "version": 1
  }
}
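The hashing step of the verification command can also be run offline against the record printed above. The sketch below applies the same canonicalization as the one-liner (sorted keys, compact separators, no ASCII escaping, UTF-8 bytes) and prints the SHA-256 digest for comparison with the canonical hash listed on this page. It assumes the JSON shown here is exactly what the resolver returns under `canonical_record`; the function name `canonical_sha256` is made up for the example.

```python
import hashlib
import json

# Canonical record as displayed on this page.
RECORD_JSON = """
{
  "metadata": {
    "abstract_canon_sha256": "0ab50ae4f366a54860372ccf4025a2176a88d4c211d7be483104ed2a2994079d",
    "cross_cats_sorted": ["cs.CL", "cs.CV", "cs.LG"],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.AI",
    "submitted_at": "2024-07-06T06:48:16Z",
    "title_canon_sha256": "ea00387c1c41a9eac9493f08e683880dd321e728c00c721a755ec6efd2b5e40c"
  },
  "schema_version": "1.0",
  "source": {"id": "2407.04973", "kind": "arxiv", "version": 1}
}
"""

def canonical_sha256(record: dict) -> str:
    """Serialize with sorted keys and compact separators, then hash as UTF-8."""
    canon = json.dumps(record, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canon.encode("utf-8")).hexdigest()

record = json.loads(RECORD_JSON)
digest = canonical_sha256(record)
print(digest)  # compare with the canonical hash shown on this page
```

Because the keys are sorted before hashing, the digest does not depend on the key order or whitespace of the JSON a mirror happens to serve.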