{"paper":{"title":"Evaluating Design Video Generation: Metrics for Compositional Fidelity","license":"http://creativecommons.org/licenses/by/4.0/","headline":"Design video generation now has an automated evaluation framework using four fidelity metrics to replace subjective human judgments.","cross_cats":["cs.AI","cs.CV"],"primary_cat":"cs.GR","authors_text":"Adrienne Deganutti, Dingning Cao, Elad Hirsch, Jaejung Seol, Purvanshi Mehta","submitted_at":"2026-05-15T17:34:05Z","abstract_excerpt":"Generative video models are increasingly used in design animation tasks, yet no standardized evaluation framework exists for this domain. Unlike natural video generation, design animation imposes structured constraints: specific components shall animate with prescribed motion types, directions, speed and timing, while non-animated regions must remain stable and layout structure must be preserved. This paper provides a fully automated evaluation framework organized across four dimensions: layout fidelity, motion correctness, temporal quality, and content fidelity. This eliminates the reliance o"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"This paper provides a fully automated evaluation framework organized across four dimensions: layout fidelity, motion correctness, temporal quality, and content fidelity. This eliminates the reliance on subjective human evaluation and establishes a common basis for benchmarking progress in the field.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"That the four proposed dimensions and their automated implementations can fully and accurately capture the structured constraints of design animation (specific component motions, stability of non-animated regions, and layout preservation) without requiring human validation or missing key failure modes.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"Proposes a four-dimensional automated evaluation framework for compositional fidelity in design video generation to enable objective benchmarking without human evaluators.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"Design video generation now has an automated evaluation framework using four fidelity metrics to replace subjective human judgments.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"45a4a56393e00e651e7675b04aa160193587eb3642833733793b542998c19a52"},"source":{"id":"2605.16223","kind":"arxiv","version":1},"verdict":{"id":"6eb44468-d9ef-4995-b554-fe408f54f3ca","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-19T17:28:23.923839Z","strongest_claim":"This paper provides a fully automated evaluation framework organized across four dimensions: layout fidelity, motion correctness, temporal quality, and content fidelity. This eliminates the reliance on subjective human evaluation and establishes a common basis for benchmarking progress in the field.","one_line_summary":"Proposes a four-dimensional automated evaluation framework for compositional fidelity in design video generation to enable objective benchmarking without human evaluators.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"That the four proposed dimensions and their automated implementations can fully and accurately capture the structured constraints of design animation (specific component motions, stability of non-animated regions, and layout preservation) without requiring human validation or missing key failure modes.","pith_extraction_headline":"Design video generation now has an automated evaluation framework using four fidelity metrics to replace subjective human judgments."},"integrity":{"clean":false,"summary":{"advisory":1,"critical":0,"by_detector":{"external_links":{"total":1,"advisory":1,"critical":0,"informational":0}},"informational":0},"endpoint":"/pith/2605.16223/integrity.json","findings":[{"note":"URL 'https://github' returned status transport error (transport error: [Errno -3] Temporary failure in name resolution) at last check.","detector":"external_links","severity":"advisory","ref_index":null,"audited_at":"2026-05-19T17:31:31.782798Z","detected_doi":null,"finding_type":"dead_url","verdict_class":"incontrovertible","detected_arxiv_id":null}],"available":true,"detectors_run":[{"name":"doi_title_agreement","ran_at":"2026-05-19T18:01:18.507419Z","status":"completed","version":"1.0.0","findings_count":0},{"name":"shingle_duplication","ran_at":"2026-05-19T17:49:44.658652Z","status":"completed","version":"0.1.0","findings_count":0},{"name":"citation_quote_validity","ran_at":"2026-05-19T17:49:44.148586Z","status":"completed","version":"0.1.0","findings_count":0},{"name":"doi_compliance","ran_at":"2026-05-19T17:36:19.344257Z","status":"completed","version":"1.0.0","findings_count":0},{"name":"ai_meta_artifact","ran_at":"2026-05-19T17:33:24.735382Z","status":"completed","version":"1.0.0","findings_count":0},{"name":"external_links","ran_at":"2026-05-19T17:31:31.782798Z","status":"completed","version":"1.0.0","findings_count":1},{"name":"cited_work_retraction","ran_at":"2026-05-19T17:22:05.510845Z","status":"completed","version":"1.0.0","findings_count":0},{"name":"claim_evidence","ran_at":"2026-05-19T16:41:55.381232Z","status":"completed","version":"1.0.0","findings_count":0}],"snapshot_sha256":"e27f97a4e11c236552130946b1dd25afa8a9242ba23248fe07c63a5bad2d62fa"},"references":{"count":23,"sample":[{"doi":"","year":null,"title":"Distractor-aware","work_id":"04c36d4e-ab7c-4159-b942-661d3662c9d7","ref_index":1,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":null,"title":"Hirsch, Elad and Yadav, Shubham and Garg, Mohit and Mehta, Purvanshi , journal=. L","work_id":"844edb57-7d6c-4e7d-b626-ae5039dea0a8","ref_index":2,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":null,"title":"Postercraft: Rethinking high-quality aesthetic poster generation in a unified framework","work_id":"6750b614-982f-458c-aaa7-c20a95c76b9b","ref_index":3,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":null,"title":"Deka, Biplab and Huang, Zifeng and Franzen, Chad and Hibschman, Joshua and Afergan, Daniel and Li, Yang and Nichols, Jeffrey and Kumar, Ranjitha , booktitle=. Rico:","work_id":"4ec02177-8d48-49db-bd9d-13b83b139a92","ref_index":4,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":null,"title":"Yamaguchi, Kota , booktitle=. Canvasvae:","work_id":"f51f7675-0e2c-4983-a13a-cd6a32ed66ce","ref_index":5,"cited_arxiv_id":"","is_internal_anchor":false}],"resolved_work":23,"snapshot_sha256":"a764594fba59b67e6ae5ea9c233d420a1d398014a0e68da22571fa79fd241ea9","internal_anchors":0},"formal_canon":{"evidence_count":2,"snapshot_sha256":"eebec38f6f13df653cd1b0db72376c7f1a30d744fedc9a0bf1b9b636e4bda33a"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}