Turning large language models into cognitive models

Marcel Binz, Eric Schulz · 2023 · arXiv 2306.03917

2 Pith papers cite this work. Polarity classification is still indexing.


representative citing papers

ErrorRadar: Benchmarking Complex Mathematical Reasoning of Multimodal Large Language Models Via Error Detection

cs.CL · 2024-10-06 · unverdicted · novelty 8.0

ErrorRadar is a new benchmark of 2,500 multimodal K-12 math problems for MLLM error step identification and categorization, where GPT-4o trails human experts by ~10%.

Simulating Word Suggestion Usage in Mobile Typing to Guide Intelligent Text Entry Design

cs.HC · 2026-02-06 · unverdicted · novelty 7.0

WSTypist is a new RL-based simulation model that reproduces human-like word suggestion strategies, individual differences, and adaptation to design changes in mobile text entry.

citing papers explorer

Showing 2 of 2 citing papers.

ErrorRadar: Benchmarking Complex Mathematical Reasoning of Multimodal Large Language Models Via Error Detection

cs.CL · 2024-10-06 · unverdicted · none · ref 8

ErrorRadar is a new benchmark of 2,500 multimodal K-12 math problems for MLLM error step identification and categorization, where GPT-4o trails human experts by ~10%.

Simulating Word Suggestion Usage in Mobile Typing to Guide Intelligent Text Entry Design

cs.HC · 2026-02-06 · unverdicted · none · ref 8

WSTypist is a new RL-based simulation model that reproduces human-like word suggestion strategies, individual differences, and adaptation to design changes in mobile text entry.
