A pipeline combining specialized OCR with Vision-Language Models improves transcription quality and speaker identification for Italian parliamentary speeches preserved as scanned documents.
We evaluate both the OCR transcription quality and the speaker tagging accuracy using the benchmark dataset released by the authors
1 Pith paper cites this work.
Fields: cs.DL (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Transcription and Recognition of Italian Parliamentary Speeches Using Vision-Language Models