TransNet V2: An effective deep network architecture for fast shot transition detection

Souček, T · 2020 · arXiv 2008.04838

8 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 2 · method 2

citation-polarity summary

background 2 · use method 2

representative citing papers

AniMatrix: An Anime Video Generation Model that Thinks in Art, Not Physics

cs.CV · 2026-05-05 · unverdicted · novelty 7.0 · 3 refs

AniMatrix generates anime videos by structuring artistic production rules into a controllable taxonomy and training the model to prioritize those rules over physical realism, achieving top scores from professional animators on prompt understanding and artistic motion.

LPM 1.0: Video-based Character Performance Model

cs.CV · 2026-04-09 · unverdicted · novelty 6.0

LPM 1.0 generates infinite-length, identity-stable, real-time audio-visual conversational performances for single characters using a distilled causal diffusion transformer and a new benchmark.

Lifting Unlabeled Internet-level Data for 3D Scene Understanding

cs.CV · 2026-04-02 · unverdicted · novelty 6.0

Unlabeled web videos processed by designed data engines generate effective training data that yields strong zero-shot and finetuned performance on 3D detection, segmentation, VQA, and navigation.

SkyReels-V2: Infinite-length Film Generative Model

cs.CV · 2025-04-17 · unverdicted · novelty 6.0

SkyReels-V2 produces infinite-length film videos via MLLM-based captioning, progressive pretraining, motion RL, and diffusion forcing with non-decreasing noise schedules.

HunyuanVideo: A Systematic Framework For Large Video Generative Models

cs.CV · 2024-12-03 · unverdicted · novelty 5.0

HunyuanVideo presents a 13B-parameter open-source video generative model with integrated data, architecture, training, and inference systems whose professional evaluations show it outperforming prior SOTA models including Runway Gen-3 and Luma 1.6.

U-CESE: Unified Clip-based Event Search Engine for AI Challenge HCMC 2025

cs.CV · 2026-05-22 · unverdicted · novelty 3.0

U-CESE integrates three CESE modules into a unified clip-based pipeline with DAKE keyframe extraction and ReCap captioning to support consistent multimodal event retrieval across video sources.

MERVIN: A Unified Framework for Multimodal Event Retrieval in Vietnamese News Videos

cs.IR · 2026-05-15 · unverdicted · novelty 3.0

MERVIN is a multimodal retrieval system for Vietnamese news videos that integrates visual and textual features with LLM-enhanced transcripts and reports strong results on a 2025 AI challenge.

MSAVBench: Towards Comprehensive and Reliable Evaluation of Multi-Shot Audio-Video Generation

cs.CV · 2026-05-19

citing papers explorer

Showing 8 of 8 citing papers.

AniMatrix: An Anime Video Generation Model that Thinks in Art, Not Physics

cs.CV · 2026-05-05 · unverdicted · none · ref 51 · 3 links

LPM 1.0: Video-based Character Performance Model

cs.CV · 2026-04-09 · unverdicted · none · ref 44

Lifting Unlabeled Internet-level Data for 3D Scene Understanding

cs.CV · 2026-04-02 · unverdicted · none · ref 99

SkyReels-V2: Infinite-length Film Generative Model

cs.CV · 2025-04-17 · unverdicted · none · ref 60

HunyuanVideo: A Systematic Framework For Large Video Generative Models

cs.CV · 2024-12-03 · unverdicted · none · ref 76

U-CESE: Unified Clip-based Event Search Engine for AI Challenge HCMC 2025

cs.CV · 2026-05-22 · unverdicted · none · ref 22

MERVIN: A Unified Framework for Multimodal Event Retrieval in Vietnamese News Videos

cs.IR · 2026-05-15 · unverdicted · none · ref 14

MSAVBench: Towards Comprehensive and Reliable Evaluation of Multi-Shot Audio-Video Generation

cs.CV · 2026-05-19 · unreviewed · ref 54
