Optical flow-based spatiotemporal sketch for video representation: A novel framework. IEEE Transactions on Circuits and Systems for Video Technology, 34(8):6963–6977
6 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years: 2026 (6)
verdicts: UNVERDICTED (6)
roles: background (1)
polarities: background (1)
representative citing papers
NeuralLVC achieves better lossless compression than H.264 and H.265 on video sequences by combining masked diffusion with temporal conditioning on frame differences.
TSegAgent achieves accurate zero-shot tooth segmentation on 3D dental scans via geometry-aware vision-language reasoning without task-specific training.
DrawVideo is a sketch-guided framework that decomposes long videos into controllable shots using keyframe sketches, appearance prompts, and motion prompts, supported by a new SketchLongVideo dataset.
UHD-GCN-BIQA models structural dependencies among sampled patches via a hybrid kNN graph and residual graph convolutions to achieve competitive PLCC and SRCC with the lowest RMSE on the UHD-IQA benchmark for blind ultra-high-definition image quality assessment.
FSCM is a spectral-guided GAN with state-space modeling, frequency enhancement via wavelets and Fourier, and dual-stream gating for improved infrared hyperspectral image colorization.
citing papers explorer
- DPC-VQA: Decoupling Quality Perception and Residual Calibration for Video Quality Assessment
  DPC-VQA decouples a frozen MLLM perceptual prior from a lightweight residual calibration branch to adapt video quality assessment to new scenarios with under 2% trainable parameters and 20% of typical MOS labels.
- NeuralLVC: Neural Lossless Video Compression via Masked Diffusion with Temporal Conditioning
  NeuralLVC achieves better lossless compression than H.264 and H.265 on video sequences by combining masked diffusion with temporal conditioning on frame differences.
- TSegAgent: Zero-Shot Tooth Segmentation via Geometry-Aware Vision-Language Agents
  TSegAgent achieves accurate zero-shot tooth segmentation on 3D dental scans via geometry-aware vision-language reasoning without task-specific training.
- DrawVideo: Generating Long Video from Storyboard Keyframe Sketches
  DrawVideo is a sketch-guided framework that decomposes long videos into controllable shots using keyframe sketches, appearance prompts, and motion prompts, supported by a new SketchLongVideo dataset.
- Ultra-High-Definition Image Quality Assessment via Graph Representation Learning
  UHD-GCN-BIQA models structural dependencies among sampled patches via a hybrid kNN graph and residual graph convolutions to achieve competitive PLCC and SRCC with the lowest RMSE on the UHD-IQA benchmark for blind ultra-high-definition image quality assessment.
- FSCM: Frequency-Enhanced Spatial-Spectral Coupled Mamba for Infrared Hyperspectral Image Colorization
  FSCM is a spectral-guided GAN with state-space modeling, frequency enhancement via wavelets and Fourier, and dual-stream gating for improved infrared hyperspectral image colorization.