hub Mixed citations

InICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 1–5

On the detection of synthetic images generated by diffusion models · 2023 · arXiv 2512.02498

Mixed citation behavior. Most common role is background (33%).

13 Pith papers citing it

Background 33% of classified citations

read on arXiv browse 13 citing papers

hub tools

JSON dossier citing papers JSON arXiv source

citation-role summary

background 2 baseline 2 method 2

citation-polarity summary

background 2 baseline 2 use method 2

representative citing papers

How Far Is Document Parsing from Solved? PureDocBench: A Source-TraceableBenchmark across Clean, Degraded, and Real-World Settings

cs.CV · 2026-05-08 · conditional · novelty 8.0

PureDocBench shows document parsing is far from solved, with top models at ~74/100, small specialists competing with large VLMs, and ranking reversals under real degradation.

AEGIS: A Holistic Benchmark for Evaluating Forensic Analysis of AI-Generated Academic Images

cs.CV · 2026-04-30 · unverdicted · novelty 7.0 · 2 refs

AEGIS is a benchmark with 7 academic categories, 39 subtypes, 4 forgery strategies, and multi-dimensional tests showing that leading models like GPT-5.1 achieve only 48.80% overall forensic accuracy on AI-generated academic images.

GlotOCR Bench: OCR Models Still Struggle Beyond a Handful of Unicode Scripts

cs.CL · 2026-04-14 · unverdicted · novelty 7.0

GlotOCR Bench shows that OCR models perform well on fewer than 10 scripts and fail to generalize beyond about 30, with results tracking pretraining coverage and models hallucinating from known scripts on unfamiliar ones.

ParseBench: A Document Parsing Benchmark for AI Agents

cs.CV · 2026-04-09 · accept · novelty 7.0

ParseBench is a new benchmark for document parsing in AI agents that reveals fragmented performance across five semantic dimensions with LlamaParse Agentic scoring highest at 84.9%.

The Character Error Vector: Decomposable errors for page-level OCR evaluation

cs.CV · 2026-04-07 · conditional · novelty 7.0

The Character Error Vector is a decomposable bag-of-characters evaluator for page-level OCR that remains defined under parsing errors and bridges parsing metrics with local CER.

MinerU2.5-Pro: Pushing the Limits of Data-Centric Document Parsing at Scale

cs.CV · 2026-04-06 · unverdicted · novelty 7.0

A fixed 1.2B model trained via diversity-aware sampling, cross-model verification, annotation refinement, and progressive stages achieves new state-of-the-art document parsing accuracy of 95.69 on OmniDocBench v1.6.

Self-Driving Datasets: From 20 Million Papers to Nuanced Biomedical Knowledge at Scale

cs.LG · 2026-05-07 · unverdicted · novelty 6.0 · 2 refs

An LLM entity-tagging pipeline plus multi-agent system extracts ~6.3M nuanced records from 22.5M PubMed papers across six tasks with lower measured error than existing curated databases.

Parser-Oriented Structural Refinement for a Stable Layout Interface in Document Parsing

cs.CV · 2026-04-03 · unverdicted · novelty 6.0

A parser-oriented refinement stage performs set-level reasoning on detector hypotheses to jointly decide instance retention, refine boxes, and set parser input order, cutting reading order errors to 0.024 on OmniDocBench.

Towards Real-World Document Parsing via Realistic Scene Synthesis and Document-Aware Training

cs.CV · 2026-03-25 · unverdicted · novelty 6.0

A realistic scene synthesis strategy and document-aware training recipe enable a 1B-parameter MLLM to achieve superior accuracy and robustness in end-to-end parsing of real-world captured documents.

FastOCR: Dynamic Visual Fixation via KV Cache Pruning for Efficient Document Parsing

cs.CV · 2026-05-17 · unverdicted · novelty 5.0

FastOCR dynamically selects a small subset of visual tokens per decoding step using focal-guided pruning and cross-step reuse, retaining 98% accuracy on Qwen2.5-VL while attending to only 5% of tokens and cutting attention latency by 3x.

RTPrune: Reading-Twice Inspired Token Pruning for Efficient DeepSeek-OCR Inference

cs.CV · 2026-05-01 · unverdicted · novelty 5.0

RTPrune introduces a reading-twice inspired two-stage pruning technique for DeepSeek-OCR that retains 84.25% tokens while delivering 99.47% accuracy and 1.23x faster prefill on OmniDocBench.

From Handwriting to Structured Data: Benchmarking AI Digitisation of Handwritten Forms

cs.CV · 2026-04-14 · unverdicted · novelty 4.0

Frontier multimodal LLMs achieve ~85% accuracy and ~90% weighted F1 on digitizing complex handwritten medical forms, with Gemini 3.1 strongest overall and prompt optimization lifting macro metrics over 60%.

MPDocBench-Parse: Benchmarking Practical Multi-page Document Parsing

cs.AI · 2026-05-21

citing papers explorer

Showing 13 of 13 citing papers.

How Far Is Document Parsing from Solved? PureDocBench: A Source-TraceableBenchmark across Clean, Degraded, and Real-World Settings cs.CV · 2026-05-08 · conditional · none · ref 37
PureDocBench shows document parsing is far from solved, with top models at ~74/100, small specialists competing with large VLMs, and ranking reversals under real degradation.
AEGIS: A Holistic Benchmark for Evaluating Forensic Analysis of AI-Generated Academic Images cs.CV · 2026-04-30 · unverdicted · none · ref 1 · 2 links
AEGIS is a benchmark with 7 academic categories, 39 subtypes, 4 forgery strategies, and multi-dimensional tests showing that leading models like GPT-5.1 achieve only 48.80% overall forensic accuracy on AI-generated academic images.
GlotOCR Bench: OCR Models Still Struggle Beyond a Handful of Unicode Scripts cs.CL · 2026-04-14 · unverdicted · none · ref 32
GlotOCR Bench shows that OCR models perform well on fewer than 10 scripts and fail to generalize beyond about 30, with results tracking pretraining coverage and models hallucinating from known scripts on unfamiliar ones.
ParseBench: A Document Parsing Benchmark for AI Agents cs.CV · 2026-04-09 · accept · none · ref 17
ParseBench is a new benchmark for document parsing in AI agents that reveals fragmented performance across five semantic dimensions with LlamaParse Agentic scoring highest at 84.9%.
The Character Error Vector: Decomposable errors for page-level OCR evaluation cs.CV · 2026-04-07 · conditional · none · ref 24
The Character Error Vector is a decomposable bag-of-characters evaluator for page-level OCR that remains defined under parsing errors and bridges parsing metrics with local CER.
MinerU2.5-Pro: Pushing the Limits of Data-Centric Document Parsing at Scale cs.CV · 2026-04-06 · unverdicted · none · ref 18
A fixed 1.2B model trained via diversity-aware sampling, cross-model verification, annotation refinement, and progressive stages achieves new state-of-the-art document parsing accuracy of 95.69 on OmniDocBench v1.6.
Self-Driving Datasets: From 20 Million Papers to Nuanced Biomedical Knowledge at Scale cs.LG · 2026-05-07 · unverdicted · none · ref 34 · 2 links
An LLM entity-tagging pipeline plus multi-agent system extracts ~6.3M nuanced records from 22.5M PubMed papers across six tasks with lower measured error than existing curated databases.
Parser-Oriented Structural Refinement for a Stable Layout Interface in Document Parsing cs.CV · 2026-04-03 · unverdicted · none · ref 13
A parser-oriented refinement stage performs set-level reasoning on detector hypotheses to jointly decide instance retention, refine boxes, and set parser input order, cutting reading order errors to 0.024 on OmniDocBench.
Towards Real-World Document Parsing via Realistic Scene Synthesis and Document-Aware Training cs.CV · 2026-03-25 · unverdicted · none · ref 19
A realistic scene synthesis strategy and document-aware training recipe enable a 1B-parameter MLLM to achieve superior accuracy and robustness in end-to-end parsing of real-world captured documents.
FastOCR: Dynamic Visual Fixation via KV Cache Pruning for Efficient Document Parsing cs.CV · 2026-05-17 · unverdicted · none · ref 17
FastOCR dynamically selects a small subset of visual tokens per decoding step using focal-guided pruning and cross-step reuse, retaining 98% accuracy on Qwen2.5-VL while attending to only 5% of tokens and cutting attention latency by 3x.
RTPrune: Reading-Twice Inspired Token Pruning for Efficient DeepSeek-OCR Inference cs.CV · 2026-05-01 · unverdicted · none · ref 28
RTPrune introduces a reading-twice inspired two-stage pruning technique for DeepSeek-OCR that retains 84.25% tokens while delivering 99.47% accuracy and 1.23x faster prefill on OmniDocBench.
From Handwriting to Structured Data: Benchmarking AI Digitisation of Handwritten Forms cs.CV · 2026-04-14 · unverdicted · none · ref 11
Frontier multimodal LLMs achieve ~85% accuracy and ~90% weighted F1 on digitizing complex handwritten medical forms, with Gemini 3.1 strongest overall and prompt optimization lifting macro metrics over 60%.
MPDocBench-Parse: Benchmarking Practical Multi-page Document Parsing cs.AI · 2026-05-21 · unreviewed · ref 61

InICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 1–5

hub tools

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer