PaddleOCR 3.0 Technical Report

Cheng Cui, Ting Sun, Manhui Lin, Tingquan Gao, Yubo Zhang, Jiaxuan Liu, Xueqing Wang, Zelun Zhang, Changda Zhou, Hongen Liu, Yue Zhang, Wenyu Lv, Kui Huang, Yichao Zhang, Jing Zhang, Jun Zhang, Yi Liu, Dianhai Yu, Yanjun Ma · 2025

3 Pith papers cite this work. Polarity classification is still indexing.


representative citing papers

Towards Self-Explainable Document Visual Question Answering with Chain-of-Explanation Predictions

cs.LG · 2026-05-07 · unverdicted · novelty 7.0

CoExVQA uses a chain-of-explanation to ground DocVQA answers in localized document regions, achieving state-of-the-art explainable performance with a 12% ANLS gain on PFL-DocVQA over prior baselines.

TextSculptor: Training and Benchmarking Scene Text Editing

cs.CV · 2026-05-20 · unverdicted · novelty 6.0

TextSculptor supplies an automated data synthesis pipeline yielding 3.2M samples plus a four-task benchmark that raises open-source scene text editing performance.

MiniCPM-o 4.5: Towards Real-Time Full-Duplex Omni-Modal Interaction

cs.CL · 2026-04-30 · unverdicted · novelty 6.0

MiniCPM-o 4.5 uses the Omni-Flow streaming framework to deliver real-time full-duplex omni-modal interaction with proactive behavior in a 9B model that approaches Gemini 2.5 Flash performance.

citing papers explorer

Showing 3 of 3 citing papers.

Towards Self-Explainable Document Visual Question Answering with Chain-of-Explanation Predictions cs.LG · 2026-05-07 · unverdicted · none · ref 10
TextSculptor: Training and Benchmarking Scene Text Editing cs.CV · 2026-05-20 · unverdicted · none · ref 4
MiniCPM-o 4.5: Towards Real-Time Full-Duplex Omni-Modal Interaction cs.CL · 2026-04-30 · unverdicted · none · ref 26
