{"paper":{"title":"PathVQA: 30000+ Questions for Medical Visual Question Answering","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","headline":"The first pathology visual question answering dataset is created with 32,799 manually verified questions from 4,998 images.","cross_cats":["cs.AI"],"primary_cat":"cs.CL","authors_text":"Eric Xing, Luntian Mou, Pengtao Xie, Xuehai He, Yichen Zhang","submitted_at":"2020-03-07T17:55:41Z","abstract_excerpt":"Is it possible to develop an \"AI Pathologist\" to pass the board-certified examination of the American Board of Pathology? To achieve this goal, the first step is to create a visual question answering (VQA) dataset where the AI agent is presented with a pathology image together with a question and is asked to give the correct answer. Our work makes the first attempt to build such a dataset. Different from creating general-domain VQA datasets where the images are widely accessible and there are many crowdsourcing workers available and capable of generating question-answer pairs, developing a med"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"To our best knowledge, this is the first dataset for pathology VQA.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"That questions automatically generated from image captions using NLP, followed by manual checks, produce medically accurate and representative questions that pathologists would actually ask when viewing the images.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"PathVQA is the first public dataset of over 32,000 questions on nearly 5,000 pathology images for medical visual question answering.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"The first pathology visual question answering dataset is created with 32,799 manually verified questions from 4,998 images.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"59db6ca86cbe0010dcd23bcee75e82b70e654adc32378bbc00f0c8a950518060"},"source":{"id":"2003.10286","kind":"arxiv","version":1},"verdict":{"id":"234bda4a-a97e-42bc-87f9-c037f6bc8997","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-15T05:09:48.556384Z","strongest_claim":"To our best knowledge, this is the first dataset for pathology VQA.","one_line_summary":"PathVQA is the first public dataset of over 32,000 questions on nearly 5,000 pathology images for medical visual question answering.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"That questions automatically generated from image captions using NLP, followed by manual checks, produce medically accurate and representative questions that pathologists would actually ask when viewing the images.","pith_extraction_headline":"The first pathology visual question answering dataset is created with 32,799 manually verified questions from 4,998 images."},"references":{"count":34,"sample":[{"doi":"","year":2015,"title":"Vqa: Visual question answering","work_id":"b5a5494b-c898-4d81-8c7a-27fa15869e0f","ref_index":1,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2014,"title":"A multi-world approach to question answering about real-world scenes based on uncertain input","work_id":"bad71b41-4475-4537-8fbf-9c90209ec63a","ref_index":2,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2015,"title":"Image question answering: A visual semantic embedding model and a new dataset","work_id":"93fdc4ae-0cf3-4648-a98d-6ec3c285f604","ref_index":3,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2017,"title":"Clevr: A diagnostic dataset for compositional language and elementary visual reasoning","work_id":"75988bdb-b9c2-48eb-93e8-e7617400c787","ref_index":4,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2017,"title":"Making the v in vqa matter: Elevating the role of image understanding in visual question answering","work_id":"429fd41f-767c-49e9-bdb0-3b99e1135349","ref_index":5,"cited_arxiv_id":"","is_internal_anchor":false}],"resolved_work":34,"snapshot_sha256":"1536d37929198a432a2fd04ab19840d0a28cb90c24b55b554728167450b3137b","internal_anchors":4},"formal_canon":{"evidence_count":2,"snapshot_sha256":"11f563d420e848fc12ed11ce6c9c7194fc951ac294c4303b8d3a2c266f10f412"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}