hub Canonical reference

Bovik, Hamid R

doi: 10 · 2003 · arXiv 2003.819861

Canonical reference. 69 Pith papers cite this work; 75% of classified citations use it as background.

citation-role summary

background 9 · method 2 · dataset 1
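
The 75% background figure quoted at the top of the page appears to follow from these role counts. The sketch below is one way to reproduce it, assuming the share is taken over the 12 classified citations rather than over all 69 citing papers. The names `role_counts` and `role_shares` are illustrative and are not part of any hub tool.

```python
# Counts copied from the citation-role summary above.
role_counts = {"background": 9, "method": 2, "dataset": 1}


def role_shares(counts):
    """Return each role's fraction of the classified citations."""
    total = sum(counts.values())  # 12 classified citations
    return {role: n / total for role, n in counts.items()}


shares = role_shares(role_counts)
print(f"{shares['background']:.0%}")  # 75%
```

Under this reading, 9 of 12 classified citations are background citations, which matches the stated 75%.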

citation-polarity summary

background 9 · use method 2 · use dataset 1

representative citing papers

Animation2Code: Evaluating Temporal Visual Reasoning in Video-to-Code Generation

cs.CV · 2026-06-26 · unverdicted · novelty 7.0

Animation2Code benchmark with 1,069 videos tests VLMs on generating animation code, showing persistent failures in temporal consistency despite good visual matches.

AutoMedBench: Towards Medical AutoResearch with Agentic AI Models

cs.AI · 2026-06-01 · conditional · novelty 7.0

AutoMedBench evaluates AI agents on long-horizon medical workflows across five stages and finds validation and submission as dominant failure points based on thousands of runs.

A Systematic Benchmark of Intraoperative Ultrasound-to-MR Synthesis for Brain Tumour Surgery

cs.CV · 2026-05-30 · conditional · novelty 7.0

On the public ReMIND dataset, a systematic benchmark of six synthesis models across 48 experiments finds LPIPS correlates with downstream segmentation utility while SSIM does not, with SynDiff-2.5D performing best.

DirectorBench: Diagnosing Long-Form Video Generation with Personalized Multi-Agent Evaluation

cs.CL · 2026-05-28 · unverdicted · novelty 7.0

DirectorBench is a profile-aware diagnostic benchmark that localizes bottlenecks in long-form video generation workflows using structured checkpoints and multi-agent evaluation.

Loki: Representation over Architecture for Diffusion-Based Portrait Animation

cs.CV · 2026-05-22 · unverdicted · novelty 7.0

Loki replaces RGB conditioning stacks with identity-orthogonal parametric face encodings rasterized for diffusion, achieving efficient cross-ID portrait animation without cross-ID training data.

Your Neighbors Know: Leveraging Local Neighborhoods for Backdoor Detection in Decentralized Learning

cs.LG · 2026-05-19 · unverdicted · novelty 7.0 · 2 refs

Argus enables backdoor detection in decentralized ML by collaborative neighbor-based validation of triggers, backed by convergence theory and reducing attack success by up to 90% on tested datasets.

PanoPlane: Plane-Aware Panoramic Completion for Sparse-View Indoor 3D Gaussian Splatting

cs.CV · 2026-05-13 · unverdicted · novelty 7.0

PanoPlane achieves up to 17.8% PSNR gains in sparse-view indoor novel view synthesis by using training-free plane-aware panoramic completion to supervise 3D Gaussian Splatting.

GuardMarkGS: Unified Ownership Tracing and Edit Deterrence for 3D Gaussian Splatting

cs.CV · 2026-05-13 · unverdicted · novelty 7.0

GuardMarkGS unifies watermarking and adversarial edit deterrence into a single optimization framework for protecting 3D Gaussian Splatting assets.

SyMTRS: Benchmark Multi-Task Synthetic Dataset for Depth, Domain Adaptation and Super-Resolution in Aerial Imagery

cs.CV · 2026-04-23 · unverdicted · novelty 7.0

A new large-scale synthetic multi-task benchmark dataset supplying pixel-perfect depth, domain-shifted night imagery, and multi-scale low-resolution pairs for aerial remote sensing.

MESA: A Training-Free Multi-Exemplar Deep Framework for Restoring Ancient Inscription Textures

cs.CV · 2026-04-19 · unverdicted · novelty 7.0

MESA restores ancient inscription textures via multi-exemplar style transfer from VGG19 features with per-layer exemplar selection and OCR-derived weights, without any model training.

GeRM: A Generative Rendering Model From Physically Realistic to Photorealistic

cs.CV · 2026-04-10 · unverdicted · novelty 7.0 · 2 refs

GeRM learns a distribution transfer vector field via a multi-condition ControlNet to convert physically-based renders into photorealistic images using text prompts and a 50K expert-curated dataset.

LumaFlux: Lifting 8-Bit Worlds to HDR Reality with Physically-Guided Diffusion Transformers

cs.CV · 2026-04-03 · unverdicted · novelty 7.0

LumaFlux is a physically and perceptually guided diffusion transformer for SDR-to-HDR conversion that introduces PGA, PCM, and HDR Residual Coupler modules plus a new training corpus and benchmark, outperforming prior ITM methods.

SNIC: Synthesized Noisy Images using Calibration

eess.IV · 2025-12-17 · unverdicted · novelty 7.0

A sensor-specific calibration pipeline using dark frames produces synthesized noisy RAW images that close 54-64% of the PSNR gap to real noise versus manufacturer profiles, accompanied by the open SNIC dataset of over 6600 paired images.

Delta Rectified Flow Sampling for Text-to-Image Editing

cs.CV · 2025-09-01 · unverdicted · novelty 7.0

DRFS is a new inversion-free editing technique for rectified flow models that models source-target velocity discrepancies and applies a time-dependent shift to improve fidelity and unify prior methods like DDS and FlowEdit.

Task complexity shapes internal representations and robustness in neural networks

cs.LG · 2025-08-07 · unverdicted · novelty 7.0

Harder classification tasks produce neural representations whose accuracy collapses under binarization and shuffling while easier tasks remain robust, defining task complexity via the performance gap between full-precision and perturbed networks.

PhotIQA: A photoacoustic image data set with image quality ratings

eess.IV · 2025-07-04 · conditional · novelty 7.0

PhotIQA is a new public dataset of 1134 expert-rated photoacoustic images for benchmarking image quality assessment in medical imaging.

SLAM&Render: A Benchmark for the Intersection Between Neural Rendering, Gaussian Splatting and SLAM

cs.RO · 2025-04-18 · unverdicted · novelty 7.0

Presents SLAM&Render, a robot-recorded benchmark dataset with 40 multi-modal sequences for testing SLAM, novel view synthesis, and Gaussian Splatting under controlled variations in lighting, arrangements, and occlusions.

Cyclic 2.5D Perceptual Loss for Cross-Modal 3D Medical Image Synthesis: T1w MRI to Tau PET

eess.IV · 2024-06-18 · unverdicted · novelty 7.0

Proposes a cyclic 2.5D perceptual loss with manufacturer SUVR standardization for T1w MRI to tau PET synthesis, reporting improved regional agreement on ADNI and SCAN cohorts across U-Net, UNETR, SwinUNETR, CycleGAN, and Pix2Pix.

Q-Align: Teaching LMMs for Visual Scoring via Discrete Text-Defined Levels

cs.CV · 2023-12-28 · conditional · novelty 7.0

Q-Align trains LMMs on discrete text-defined levels for visual scoring, achieving SOTA on IQA, IAA, and VQA while unifying the tasks in OneAlign.

Recovering Sharp Conductivity Features in the Finite-Data Calderón Problem with Physics-Informed Neural Networks

cs.LG · 2026-06-26 · unverdicted · novelty 6.0

A PINN framework with separate networks for conductivity and potentials, multiscale wavelet excitations, and FFE recovers dominant conductivity structures from finite DtN data with 3-12% relative error on synthetic tests, with FFE aiding sharp features.

Differential Unfolding: Efficient Unfolding Reconstruction for Video Snapshot Compressive Imaging

cs.CV · 2026-06-23 · unverdicted · novelty 6.0

Differential Unfolding replaces uniform stacking in deep unfolding networks with a heterogeneous structure of anchoring and differential evolution stages to achieve better accuracy-efficiency trade-offs in video SCI reconstruction.

Temporally Aware Densification for Dynamic 3D Gaussian Splatting

cs.CV · 2026-06-22 · unverdicted · novelty 6.0

Introduces Visibility-Aware Densification with Temporally-Adaptive Thresholding and Temporal Offset Warping to improve dynamic region quality in 3D Gaussian Splatting on three benchmarks.

Scene-Adaptive Nonlinear Tone Curves for Pseudo Ground-Truth Generation in Low-Light 3D Gaussian Splatting

cs.CV · 2026-06-10 · unverdicted · novelty 6.0

Scene-adaptive nonlinear tone curves (ASE and AP3) with percentile normalisation and offset outperform linear gain for pseudo-GT generation in low-light 3DGS, delivering PSNR gains up to 4.34 dB on LOM and 3.25 dB on RealX3D across 21 scenes.

Seeing What Matters: Perceptual Wrapper with Common Randomness for 3D Gaussian Splatting

cs.CV · 2026-06-10 · unverdicted · novelty 6.0

A plug-and-play perceptual wrapper using common random noise and Wasserstein Distortion supervision improves texture quality and reduces model size in 3D Gaussian Splatting.

citing papers explorer

Showing 35 of 35 citing papers after filters.

Animation2Code: Evaluating Temporal Visual Reasoning in Video-to-Code Generation cs.CV · 2026-06-26 · unverdicted · none · ref 39
Animation2Code benchmark with 1,069 videos tests VLMs on generating animation code, showing persistent failures in temporal consistency despite good visual matches.
A Systematic Benchmark of Intraoperative Ultrasound-to-MR Synthesis for Brain Tumour Surgery cs.CV · 2026-05-30 · conditional · none · ref 47
On the public ReMIND dataset, a systematic benchmark of six synthesis models across 48 experiments finds LPIPS correlates with downstream segmentation utility while SSIM does not, with SynDiff-2.5D performing best.
Loki: Representation over Architecture for Diffusion-Based Portrait Animation cs.CV · 2026-05-22 · unverdicted · none · ref 24
Loki replaces RGB conditioning stacks with identity-orthogonal parametric face encodings rasterized for diffusion, achieving efficient cross-ID portrait animation without cross-ID training data.
PanoPlane: Plane-Aware Panoramic Completion for Sparse-View Indoor 3D Gaussian Splatting cs.CV · 2026-05-13 · unverdicted · none · ref 54
PanoPlane achieves up to 17.8% PSNR gains in sparse-view indoor novel view synthesis by using training-free plane-aware panoramic completion to supervise 3D Gaussian Splatting.
GuardMarkGS: Unified Ownership Tracing and Edit Deterrence for 3D Gaussian Splatting cs.CV · 2026-05-13 · unverdicted · none · ref 47
GuardMarkGS unifies watermarking and adversarial edit deterrence into a single optimization framework for protecting 3D Gaussian Splatting assets.
SyMTRS: Benchmark Multi-Task Synthetic Dataset for Depth, Domain Adaptation and Super-Resolution in Aerial Imagery cs.CV · 2026-04-23 · unverdicted · none · ref 39
A new large-scale synthetic multi-task benchmark dataset supplying pixel-perfect depth, domain-shifted night imagery, and multi-scale low-resolution pairs for aerial remote sensing.
MESA: A Training-Free Multi-Exemplar Deep Framework for Restoring Ancient Inscription Textures cs.CV · 2026-04-19 · unverdicted · none · ref 25
MESA restores ancient inscription textures via multi-exemplar style transfer from VGG19 features with per-layer exemplar selection and OCR-derived weights, without any model training.
GeRM: A Generative Rendering Model From Physically Realistic to Photorealistic cs.CV · 2026-04-10 · unverdicted · none · ref 3 · 2 links
GeRM learns a distribution transfer vector field via a multi-condition ControlNet to convert physically-based renders into photorealistic images using text prompts and a 50K expert-curated dataset.
LumaFlux: Lifting 8-Bit Worlds to HDR Reality with Physically-Guided Diffusion Transformers cs.CV · 2026-04-03 · unverdicted · none · ref 10
LumaFlux is a physically and perceptually guided diffusion transformer for SDR-to-HDR conversion that introduces PGA, PCM, and HDR Residual Coupler modules plus a new training corpus and benchmark, outperforming prior ITM methods.
Differential Unfolding: Efficient Unfolding Reconstruction for Video Snapshot Compressive Imaging cs.CV · 2026-06-23 · unverdicted · none · ref 30
Differential Unfolding replaces uniform stacking in deep unfolding networks with a heterogeneous structure of anchoring and differential evolution stages to achieve better accuracy-efficiency trade-offs in video SCI reconstruction.
Temporally Aware Densification for Dynamic 3D Gaussian Splatting cs.CV · 2026-06-22 · unverdicted · none · ref 32
Introduces Visibility-Aware Densification with Temporally-Adaptive Thresholding and Temporal Offset Warping to improve dynamic region quality in 3D Gaussian Splatting on three benchmarks.
Scene-Adaptive Nonlinear Tone Curves for Pseudo Ground-Truth Generation in Low-Light 3D Gaussian Splatting cs.CV · 2026-06-10 · unverdicted · none · ref 40
Scene-adaptive nonlinear tone curves (ASE and AP3) with percentile normalisation and offset outperform linear gain for pseudo-GT generation in low-light 3DGS, delivering PSNR gains up to 4.34 dB on LOM and 3.25 dB on RealX3D across 21 scenes.
Seeing What Matters: Perceptual Wrapper with Common Randomness for 3D Gaussian Splatting cs.CV · 2026-06-10 · unverdicted · none · ref 41
A plug-and-play perceptual wrapper using common random noise and Wasserstein Distortion supervision improves texture quality and reduces model size in 3D Gaussian Splatting.
LiFT: Lifted Inter-slice Feature Trajectories for 3D Image Generation from 2D Generators cs.CV · 2026-05-18 · unverdicted · none · ref 54
LiFT factorizes 3D medical volume synthesis into per-slice 2D generation and inter-slice trajectory learning, using a tri-planar drifting loss for unconditional coherence and a z-context mixer for paired translation tasks.
MSIQ: Moment-based Scale-Invariant Quality Measure for Single Image Super-Resolution cs.CV · 2026-05-17 · unverdicted · none · ref 11
MSIQ is a scale-invariant, model-free quality metric for single image super-resolution using normalized central geometric moments for direct comparison of different-resolution images.
LiBrA-Net: Lie-Algebraic Bilateral Affine Fields for Real-Time 4K Video Dehazing cs.CV · 2026-05-12 · unverdicted · none · ref 39
LiBrA-Net achieves real-time native 4K video dehazing via Lie-algebraic bilateral affine fields and releases the first 4K paired dehazing video benchmark with per-frame annotations.
AsyncEvGS: Asynchronous Event-Assisted Gaussian Splatting for Handheld Motion-Blurred Scenes cs.CV · 2026-05-08 · unverdicted · none · ref 42
AsyncEvGS reconstructs high-fidelity 3D scenes from motion-blurred images by first deblurring via event data then using VGGT-based pose estimation and structure-driven losses inside Gaussian Splatting.
SIFT-VTON: Geometric Correspondence Supervision on Cross-Attention for Virtual Try-On cs.CV · 2026-05-02 · unverdicted · none · ref 29
SIFT-VTON adds explicit geometric supervision from SIFT keypoints to diffusion-based virtual try-on to improve spatial alignment and detail preservation.
CAHAL: Clinically Applicable resolution enHAncement for Low-resolution MRI scans cs.CV · 2026-04-20 · unverdicted · none · ref 288
CAHAL introduces a physics-informed mixture-of-experts super-resolution network for clinical MRI that conditions on resolution and anisotropy and uses edge-penalised, Fourier, and segmentation-guided losses to reduce hallucinations compared with prior generative methods.
CDSA-Net: Collaborative Decoupling of Vascular Structure and Background for High-Fidelity Coronary Digital Subtraction Angiography cs.CV · 2026-04-19 · unverdicted · none · ref 52
CDSA-Net decouples vascular structure extraction and background restoration in coronary DSA via hierarchical geometric priors and adaptive noise modeling to eliminate artifacts while preserving tissue fidelity.
PostureObjectstitch: Anomaly Image Generation Considering Assembly Relationships in Industrial Scenarios cs.CV · 2026-04-15 · unverdicted · none · ref 42
PostureObjectStitch generates assembly-aware anomaly images by decoupling multi-view features into high-frequency, texture and RGB components, modulating them temporally in a diffusion model, and applying conditional loss plus geometric priors to preserve correct component relationships.
ReplicateAnyScene: Zero-Shot Video-to-3D Composition via Textual-Visual-Spatial Alignment cs.CV · 2026-04-12 · unverdicted · none · ref 57
ReplicateAnyScene performs fully automated zero-shot video-to-compositional-3D reconstruction by cascading alignments of generic priors from vision foundation models across textual, visual, and spatial dimensions.
GIF: A Conditional Multimodal Generative Framework for IR Drop Imaging in Chip Layouts cs.CV · 2026-04-11 · unverdicted · none · ref 23
GIF fuses geometrical image features and logical graph topology in a conditional diffusion model to generate high-quality IR drop images for chip layouts, outperforming prior ML methods on CircuitNet-N28 with SSIM 0.78, Pearson 0.95, PSNR 21.77, and NMAE 0.026.
SyncBreaker: Stage-Aware Multimodal Adversarial Attacks on Audio-Driven Talking Head Generation cs.CV · 2026-04-09 · unverdicted · none · ref 53
SyncBreaker jointly attacks image and audio streams with Multi-Interval Sampling and Cross-Attention Fooling to degrade speech-driven talking head generation more than single-modality baselines.
Single-Stage Signal Attenuation Diffusion Model for Low-Light Image Enhancement and Denoising cs.CV · 2026-04-07 · unverdicted · none · ref 31
SADM adds a signal attenuation coefficient to the diffusion forward process so that reverse denoising simultaneously recovers brightness and suppresses noise without extra stages or correction modules.
VisPhyWorld: Probing Physical Reasoning via Code-Driven Video Reconstruction cs.CV · 2026-02-09 · unverdicted · none · ref 41
VisPhyWorld evaluates MLLMs' physical reasoning via executable code generation for video reconstruction, with VisPhyBench showing strong semantics but weak parameter inference and dynamics simulation.
RefGlass-GS: A UAV-Enabled Fusion Framework for Photorealistic, Semantic and Interactive Digitization of Reflective Glass Facades via Gaussian Splatting cs.CV · 2026-06-27 · unverdicted · none · ref 54
RefGlass-GS is a fusion framework using UAV data, MAP-based panel segmentation, viewpoint optimization, and modified Gaussian Splatting with Reflection MLP to achieve improved photorealistic and semantic modeling of reflective glass facades.
SignNet-1M: Large-Scale Multilingual Sign Language Video Dataset with Downstream Benchmarks cs.CV · 2026-06-23 · unverdicted · none · ref 30
The paper releases SignNet-1M, a 1M-scale augmented dataset for ASL, CSL and DGS with 3DGS and diffusion-based variations, plus benchmarks showing improved cross-shift generalization.
StereoGenBench: A Synthetic Multi-Camera Benchmark for Stereo Generation under Controlled Baseline Regimes cs.CV · 2026-05-22 · unverdicted · none · ref 36
StereoGenBench is a new synthetic benchmark dataset featuring calibrated multi-baseline stereo pairs with dense metric depth, intrinsics, and poses from Unreal Engine renders for controlled evaluation of stereo generation.
Flow matching for Sentinel-2 super-resolution: implementation, application, and implications cs.CV · 2026-05-01 · unverdicted · none · ref 52
Flow matching achieves single-step pixel accuracy and 20-step perceptual quality for Sentinel-2 super-resolution, outperforming diffusion and Real-ESRGAN while enabling large-scale 2.5 m land-cover products.
Training-inference input alignment outweighs framework choice in longitudinal retinal image prediction cs.CV · 2026-04-18 · unverdicted · none · ref 30
Training-inference input alignment outweighs framework choice for longitudinal retinal image prediction, with deterministic regression matching complex models when acquisition variability dominates disease progression.
GeneralVLA-2: Geometry-Aware Reconstruction and Governed Memory for Robot Planning cs.CV · 2026-06-16 · unverdicted · none · ref 24
GeneralVLA-2 introduces GeoFuse-MV3D for improved multi-view 3D reconstruction and a governed memory system, demonstrating modest gains on 3D object and task benchmarks.
Contrast-Informed Augmentation and Domain-Adversarial Training for Adult-to-Neonatal MR Reconstruction Generalization cs.CV · 2026-06-11 · unverdicted · none · ref 11
Mixed training with contrast-informed augmentation and domain-adversarial training improves E2E-VarNet performance on neonatal T2-weighted brain MR reconstruction at R=4 and R=8 compared to adult-only training.
Low-Magnification SEM May Suffice: Interpretable Deep Learning for Multi-Scale Fracture-Cause Classification in Zirconia-Toughened Alumina cs.CV · 2026-05-28 · unverdicted · none · ref 18
A fine-tuned ViT on 8493 SEM images classifies fracture causes in zirconia-toughened alumina at 0.907 accuracy and 0.888 macro-F1, with comparable performance at 50x versus higher magnifications.
MSDS: Deep Structural Similarity with Multiscale Representation cs.CV · 2026-04-21 · unverdicted · none · ref 4
MSDS computes DeepSSIM at multiple pyramid scales and fuses the scores with learned weights, producing consistent improvements over single-scale DeepSSIM on IQA benchmarks with negligible extra cost.