pith.

archive

Every paper Pith has read. Search by title, abstract, or pith.

9568 papers in cs.CV · page 4

  1. cs.CV 2026-05-21 reviewed
    Synthetic RAW data yields same low-light detection metrics as real

    Making the Discrete Continuous: Synthetic RAW Augmentations for Fine-Grained Evaluation of Person Detection Performance in Low Light

    Valeria Pais +5

  2. cs.CV 2026-05-21 reviewed
    Pre-VLA lifts VLA success rates from 31% to 38%

    Pre-VLA: Preemptive Runtime Verification for Reliable Vision-Language-Action and World-Model Rollouts

    Zhen Sun +8

  3. eess.IV 2026-05-21 reviewed
    Block-sparse model separates rPPG signals from video noise

    Time-varying rPPG signal separation via block-sparse signal model

    Kosuke Kurihara +3

  4. cs.CV 2026-05-21 reviewed
    Dual-shutter pairs invert motion blur and distortion

    Moment-Reenacting: Inverse Motion Degradation with Cross-shutter Guidance

    Xiang Ji +4

  5. cs.CV 2026-05-21 reviewed
    Paired blur and distortion images recover high-speed motion

    Moment-Reenacting: Inverse Motion Degradation with Cross-shutter Guidance

    Xiang Ji +4

  6. cs.CV 2026-05-21 reviewed
    Table structure recovered by predicting grid counts and separators directly

    FastTab: A Fast Table Recognizer with a Tiny Recursive Module and 1D Transformers

    Laziz Hamdi +3

  7. cs.CV 2026-05-21 reviewed
    GenRe generalizes 3D urban scenes to new viewpoints in minutes

    Diffusion-guided Generalizable Enhancer for Urban Scene Reconstruction

    Henry Che +5

  8. cs.CV 2026-05-21 reviewed
    Explicit baseline fixes attribution errors in neural explanations

    The Neglected Baseline in Model Interpretation

    Yongjin Cui +1

  9. cs.CV 2026-05-21 reviewed
    New benchmark and GRPO method lift MLLMs past proprietary models on receipt reasoning

    From Recognition to Reasoning: Benchmarking and Enhancing MLLMs on Real-World Receipt Document Understanding

    Yandi Wang +7

  10. cs.CV 2026-05-21 reviewed
    Lesion grounding lifts ophthalmic VQA accuracy and clarity

    Towards Clinically Interpretable Ophthalmic VQA via Spatially-Grounded Lesion Evidence

    Xingyue Wang +6

  11. cs.CV 2026-05-21 reviewed
    LLMs recognize activities from muscle signals after language mapping

    Translating Signals to Languages for sEMG-Based Activity Recognition

    Ming Wang +5

  12. cs.CV 2026-05-21 reviewed
    Benchmark shows AI unreliable on agricultural tool tasks

    AgroTools: A Benchmark for Tool-Augmented Multimodal Agents in Agriculture

    Zi Ye +12

  13. cs.CV 2026-05-21 reviewed
    3D eye prior synthesizes training data for any new AR/VR tracker

    GazePrior: Zero-Shot AR/VR Eye Tracking via Learned 3D Gaze Reconstruction

    Corentin Dumery +5

  14. cs.CV 2026-05-21 reviewed
    Multiple metrics needed to judge liver vessel segmentation

    VEELA: A Clinically-Constrained Benchmark for Liver Vessel Segmentation in Computed Tomography Angiography

    Ziya Ata Yazıcı +21

  15. cs.CV 2026-05-21 reviewed
    QuantSR+ raises 2-bit SR accuracy by 0.29 dB while cutting ops 87.9%

    QuantSR+: Pushing the Limit of Quantized Image Super-Resolution Networks

    Haotong Qin +6

  16. cs.CV 2026-05-21 reviewed
    MLLM planner in ViT space guides DiT to SOTA video generation and edits

    Bernini: Latent Semantic Planning for Video Diffusion

    Bernini Team: Chenchen Liu +10

  17. cs.CV 2026-05-21 reviewed
    Watermarking 4D splats by gating at motion-curvature instants

    4D-GSW: Kinematic-Aware Spatio-Temporal Consistent Watermarking for 4D Gaussian Splatting

    Sifan Zhou +3

  18. cs.CV 2026-05-21 reviewed
    Multispectral LiDAR lifts 3D land cover mIoU by up to 7.8 points

    3D LULC classification using multispectral LiDAR and deep learning: current and prospective schemes

    Narges Takhtkeshha +5

  19. cs.CV 2026-05-21 reviewed
    K-space hybrid model holds up better for breast lesion segmentation under acceleration

    Robustness of breast lesion segmentation under MRI undersampling improves with k-space-aware deep learning

    Lukas T. Rotkopf +5

  20. cs.CV 2026-05-21 reviewed
    Anchor swaps erase specific identities from face generators

    PIU: Proximity-guided Identity Unlearning in ID-Conditioned Diffusion Models

    Jose Edgar Hernandez Cancino Estrada +5

  21. cs.RO 2026-05-21 reviewed
    DEVO exports sparse point clouds matching EMVS at 5 cm

    Extending Deep Event Visual Odometry with Sparse Point-Cloud Export

    Alireza Safdari +1

  22. cs.CV 2026-05-21 reviewed
    YOLOv2 with FPN and switchable convolution hits 68% mAP on virus patches

    Detection of Virus and Small Cell Patches in Foci Images Using Switchable Convolution and Feature Pyramid Networks

    Amrita Singh +1

  23. cs.CV 2026-05-21 reviewed
    Curved fractal patches fool VIS-IR VLMs

    Exposing Vulnerabilities in Visible-Infrared VLMs: A Unified Geometric Adversarial Framework with Cross-Task Transferability

    Xiang Chen +8

  24. cs.RO 2026-05-21 reviewed
    4D trajectories and sparse tracking enable zero-shot robot-object tasks

    Imagine2Real: Towards Zero-shot Humanoid-Object Interaction via Video Generative Priors

    Jiahe Chen +9

  25. cs.RO 2026-05-21 reviewed
    Sparse keypoints in behavior model enable zero-shot humanoid interactions

    Imagine2Real: Towards Zero-shot Humanoid-Object Interaction via Video Generative Priors

    Jiahe Chen +9

  26. cs.CV 2026-05-21 reviewed
    Multi-grained compression lifts long video QA accuracy

    MuKV: Multi-Grained KV Cache Compression for Long Streaming Video Question-Answering

    Junbin Xiao +4

  27. cs.NI 2026-05-21 reviewed
    YOLOv8 recall falls below 40% under strong turbulence in satellite images

    Impact of Atmospheric Turbulence and Pointing Error on Earth Observation

    Celia Sánchez-de-Miguel +4

  28. cs.LG 2026-05-21 reviewed
    Evidence hierarchy lifts Bayesian threat classification to 95%

    An Evidence Hierarchy for Bayesian Object Classification via OSINT-Aided Heterogeneous Sensor Fusion

    Jan Nausner +1

  29. cs.CV 2026-05-21 reviewed
    OMR tops matched music score search

    Direct content-based retrieval from music scores images

    Noelia Luna-Barahona +4

  30. cs.CV 2026-05-21 reviewed
    Graphs plus diffusion improve tumor segmentation with missing MRI scans

    D3Seg: Dependency-Aware Diffusion for Brain Tumor Segmentation with Missing Modalities

    Danish Ali +3

  31. cs.CV 2026-05-21 reviewed
    Model recovers 3D hand poses from distant room corners

    REACH: Hand Pose Estimation from Room Corners

    Shu Nakamura +6

  32. cs.CV 2026-05-21 reviewed
    Semi-supervised UniMatch V2 segments weather-degraded images

    A Robust Semantic Segmentation Pipeline for the CVPR 2026 8th UG2+ Challenge Track 2

    Jinming Chai +3

  33. cs.CV 2026-05-21 reviewed
    Semi-supervised training lifts segmentation in bad weather

    A Robust Semantic Segmentation Pipeline for the CVPR 2026 8th UG2+ Challenge Track 2

    Jinming Chai +3

  34. cs.CV 2026-05-21 reviewed
    Anatomy residual pathway lifts VCE mAP to 0.3409

    GALAR-TemporalNet v2: Anatomy-Guided Dual-Branch Temporal Classification with Bidirectional Mamba and Dual-Graph GCN for Video Capsule Endoscopy -- after competition results

    Jiye Won +2

  35. cs.CV 2026-05-21 reviewed
    Self-evolving pool optimizes image restoration agent

    EvoIR-Agent: Self-Evolving Image Restoration Agentic System via Experience-Driven Learning

    Kailin Zhuang +2

  36. cs.CV 2026-05-21 reviewed
    Text from LLMs guides zero-shot action localization in videos

    Zero-Shot Temporal Action Localization Through Textual Guidance

    Benedetta Liberatori +5

  37. cs.CV 2026-05-21 reviewed
    Video models top open suturing skill challenge

    OSS: Open Suturing Skills Vision-Based Assessment Challenge 2024-2025

    Hanna Hoffmann +56

  38. cs.CV 2026-05-21 reviewed
    Graph of patches cuts UHD quality prediction error

    Ultra-High-Definition Image Quality Assessment via Graph Representation Learning

    Shaode Yu +6

  39. cs.CV 2026-05-21 reviewed
    Feed-forward model reconstructs 4D scenes without camera poses

    No Pose, No Problem in 4D: Feed-Forward Dynamic Gaussians from Unposed Multi-View Videos

    Matteo Balice +5

  40. cs.CV 2026-05-21 reviewed
    Events and illumination collaborate to fix low-light photos

    Event-Illumination Collaborative Low-light Image Enhancement with a High-resolution Real-world Dataset

    Senyan Xu +7

  41. cs.CV 2026-05-21 reviewed
    Telematics and CV fusion boosts MLLM safety event detection

    Enhancing Multimodal Large Language Models for Safety-Critical Driving Video Analysis

    Tomaso Trinci +2

  42. cs.CV 2026-05-21 reviewed
    Hybrid sampling beats pure uncertainty or diversity in active learning

    Balancing Uncertainty and Diversity of Samples: Leveraging Diversity of Least, High Confidence Samples for Effective Active Learning

    Vipul Arya +4

  43. cs.AI 2026-05-21 reviewed
    Dual selection prunes video tokens while keeping static scenes and changes

    ST-SimDiff: Balancing Spatiotemporal Similarity and Difference for Efficient Video Understanding with MLLMs

    Bingjun Luo +3

  44. cs.CV 2026-05-21 reviewed
    FlowGS speeds up continuous-scale super-resolution for remote sensing

    Flow-based Gaussian Splatting for Continuous-Scale Remote Sensing Image Super-Resolution

    Jiangwei Mo +2

  45. cs.CV 2026-05-21 reviewed
    One sentence becomes a full short drama with AI agents

    One Sentence, One Drama: Personalized Short-Form Drama Generation via Multi-Agent Systems

    Yufei Shi +7

  46. cs.CV 2026-05-21 reviewed
    Event cameras match RGB gait ID in light

    EventGait: Towards Robust Gait Recognition with Event Streams

    Senyan Xu +6

  47. cs.CV 2026-05-21 reviewed
    Swapping ViT attention heads for depthwise convolutions speeds inference 17-20%

    Accelerating Vision Foundation Models with Drop-in Depthwise Convolution

    Carmelo Scribano +7

  48. cs.CV 2026-05-21 reviewed
    Two-stage AI plans then executes fixes for photo flaws

    AesFormer: Transform Everyday Photos into Beautiful Memories

    Tianxiang Du +2

  49. cs.CV 2026-05-21 reviewed
    Diffusion models correct motion in 3D brain MRI

    MotionDPS: Motion-Compensated 3D Brain MRI Reconstruction

    Antonio Ortiz-Gonzalez +3

  50. cs.AI 2026-05-21 reviewed
    MLLMs get personality scores right but ignore video cues half the time

    Perception or Prejudice: Can MLLMs Go Beyond First Impressions of Personality?

    Caixin Kang +10