pith.

archive

Every paper Pith has read. Search by title, abstract, or pith.

9568 papers in cs.CV · page 2

  1. cs.LG 2026-05-22 reviewed
    Multi-view probes read model weights more accurately

    What Linear Probes Miss: Multi-View Probing for Weight-Space Learning

    Eunwoo Heo +2

  2. cs.CV 2026-05-22 reviewed
    3D CNNs spot and name hand gestures in live video

    Online Hand Gesture Recognition Using 3D Convolutional Neural Networks

    Yinghao Qin +1

  3. cs.CV 2026-05-22 reviewed
    Roadside LiDAR generates vehicle data to improve detection

    RS2AD-LiDAR: End-to-End Autonomous Driving LiDAR Data Generation from Roadside Sensor Observations

    Runyi Huang +5

  4. cs.CV 2026-05-22 reviewed
    Deep correspondences jointly calibrate camera intrinsics and LiDAR extrinsics

    Joint Target-Less Intrinsic and Extrinsic Camera-LiDAR Calibration using Deep Point Correspondences

    Simon Bultmann +2

  5. cs.CV 2026-05-22 reviewed
    Velocity split accelerates flow models without training

    VDE: Training-Free Accelerating Rectified Flow Model via Velocity Decomposition and Estimation

    Junwen Tan +3

  6. cs.CV 2026-05-22 reviewed
    Per-pixel module confines FPS weapon actions to local scope

    SCOPE: Simulating Cross-game Operations in Playable Environments for FPS World Models

    Zizhao Tong +13

  7. cs.CV 2026-05-22 reviewed
    Uncertainty gate activates contrastive decoding only on risky tokens

    CHASD: Language Increment-Calibrated Contrastive Decoding against Hallucination in LVLMs

    Xiaoyi Huang +2

  8. cs.CV 2026-05-22 reviewed
    Geometry measure added to lane filtering keeps accurate lines

    GFSR: Geometric Fidelity and Spatial Refinement for Reliable Lane Detection

    Tiancheng Wang +7

  9. cs.CV 2026-05-22 reviewed
    Hybrid quantum models raise blood cell F1 scores by up to 3.7%

    Enhancing Blood Cells Classification using Hybrid Quantum Neural Networks

    Guilherme Cruz +4

  10. eess.IV 2026-05-22 reviewed
    EF-LIC skips entropy coding yet matches its performance

    Efficient Learned Image Compression without Entropy Coding

    Hao Cao +3

  11. cs.CV 2026-05-22 reviewed
  12. cs.CV 2026-05-22 reviewed
    Dense 4D volumes preserve local cues for video action recognition

    Spatio-Temporal Similarity Volume Aggregation for Open-Vocabulary Action Recognition

    Yerim So +3

  13. cs.CV 2026-05-22 reviewed
    Feed-forward model creates language-labeled 3D scenes from sparse photos

    LangFlash: Feed-forward 3D Language Gaussian Splatting from Sparse Unposed Images

    Yilong Liu +3

  14. eess.IV 2026-05-22 reviewed
    Neural operator deblurs varying blur in pathology slides

    Discontinuous Galerkin Neural Operator for Pathology Defocus Deblurring

    Shaoqing Duan +4

  15. cs.CV 2026-05-22 reviewed
    Vision-language agent picks depth experts per sample

    DepthAgent: Towards Better Universal Depth Estimation via Sample-wise Expert Selection

    Jie Zhu +2

  16. cs.CV 2026-05-22 reviewed
    Unified engine retrieves events from large video sets consistently

    U-CESE: Unified Clip-based Event Search Engine for AI Challenge HCMC 2025

    Duc-Nhuan Le +4

  17. cs.CV 2026-05-22 reviewed
    EvalVerse calibrates VLMs to expert cinematic video standards

    EvalVerse: Pipeline-Aware and Expert-Calibrated Benchmarking for Professional Cinematic Video Generation

    Songlin Yang +25

  18. cs.CV 2026-05-22 reviewed
    Hybrid planner reaches 94.85 on NAVSIM

    ChainFlow-VLA: Causal Flow Planning with Vision-Language Models

    Xiyang Wang +9

  19. cs.CV 2026-05-22 reviewed
    Coloring noise in Sobolev space fixes SR spectral mismatch

    Coloring the Noise: Adversarial Sobolev Alignment for Faithful Image Super Resolution

    Hongbo Wang +5

  20. cs.RO 2026-05-22 reviewed
    Convex hull of historical prompts bridges new VLN domains

    Turning Adaptation into Assets: Cross-Domain Bridging for Online Vision-Language Navigation

    Zixuan Hu +5

  21. cs.CV 2026-05-22 reviewed
    Consensus method improves noisy label correction for rare classes

    CARE: Class-Adaptive Expert Consensus for Reliable Learning with Long-Tailed Noisy Labels

    Mengke Li +5

  22. cs.CV 2026-05-22 reviewed
    Single-frame edit extends across video via diffusion priors

    SimInsert: Seamless Video Object Insertion via Regional Sparse Attention Fusion

    Xinyu Chen +11

  23. cs.CV 2026-05-22 reviewed
    Benchmark supplies multi-baseline stereo pairs with full calibration

    StereoGenBench: A Synthetic Multi-Camera Benchmark for Stereo Generation under Controlled Baseline Regimes

    Yangzhi Cui +2

  24. cs.CV 2026-05-22 reviewed
    IDEAL detects anomalies from both normal and anomalous few-shot examples

    Beyond Normal References: Discriminative Few-Shot Anomaly Detection

    Huan Wang +3

  25. cs.CV 2026-05-22 reviewed
    Benchmark shows VLMs fail at tracing causal chains in video

    CaST-Bench: Benchmarking Causal Chain-Grounded Spatio-Temporal Reasoning for Video Question Answering

    Mingfang Zhang +9

  26. cs.CV 2026-05-22 reviewed
    Homography mapping yields linear bounds for camera motion verification

    Lipschitz Optimization for Formal Verification of Homographies

    Jean-Guillaume Durand +3

  27. cs.CV 2026-05-22 reviewed
    Physics-semantic keyframe scoring fixes occluded video editing

    Occlusion-Aware Physics-Semantic Keyframe Selection for Robust Video Editing

    Lin Liu +6

  28. cs.CV 2026-05-22 reviewed
    VLMs reach only 5.5% success on implicit intent navigation

    IntentionNav: A Benchmark for Intent-Driven Object Navigation from Implicit Human Instruction

    Lin Qian +6

  29. eess.IV 2026-05-22 reviewed
    GMENet generates missing MRI to expand usable glioma data by 97%

    GMENet: Generative Mixture of Experts Network for Multi-Center Glioma Diagnosis with Incomplete Imaging Sequences

    Pengfei Song +7

  30. cs.CV 2026-05-22 reviewed
    Joint pose and image prediction improves multi-person scene accuracy

    Composing People Together: Iterative Pose-Image Generation for Multi-Person Interaction Scenes

    Wenxuan Peng +2

  31. cs.CV 2026-05-22 reviewed
    VLMs trail humans by 28.4 points on driving scenes benchmark

    DRIVESPATIAL: A Benchmark for Spatiotemporal Intelligence in VLMs for Autonomous Driving

    Hao Vo +12

  32. cs.CV 2026-05-22 reviewed
    Quantized labels cut rPPG model size 88% and raise speed 191%

    LQ-rPPG: A Label-Quantized Coarse-to-Fine Learning Framework for Remote Physiological Measurement

    Jun Seong Lee +3

  33. cs.RO 2026-05-22 reviewed
    Semantic cues speed drone exploration 13.7 times on average

    Semantic-Aware Guided Drone Exploration for Language-Conditioned 3D Indoor Mapping

    Nitin Vegesna +1

  34. cs.CV 2026-05-22 reviewed
    Attributes replace category lists for remote sensing pre-training

    SLIP-RS: Structured-Attribute Language-Image Pre-Training for Remote Sensing Object Detection

    Chenxu Wang +5

  35. cs.CV 2026-05-22 reviewed
    VLMs fail to infer visual relations from examples

    VisAnalog: A Diagnostic Suite for Visual Concept Transfer on Natural Images

    Zhaonan Li +15

  36. eess.IV 2026-05-22 reviewed
    EEG model reaches 34.5% top-1 accuracy in 200-way image retrieval

    STAMBRIDGE: Spectral-Temporal Amplitude-aware Mid-Feature Bridge for EEG Visual Decoding

    Jiahe Meng +7

  37. cs.CV 2026-05-22 reviewed
    Verified prompts plus longitudinal context raise lesion tracking Dice by 4.5 points

    Exploiting Longitudinal Context in Clinician-Verified Interactive Lesion Tracking

    Yannick Kirchhoff +7

  38. cs.CV 2026-05-22 reviewed
    One frozen VLM detects video anomalies without training

    CoReVAD: A Contextual Reasoning Framework for Training-Free Video Anomaly Detection

    Hyeongmuk Lim +1

  39. cs.CV 2026-05-22 reviewed
    Schrödinger Bridge raises deepfake AP@0.95 by 3-10%

    Inconsistency-aware Multimodal Schrödinger Bridge for Deepfake Localization

    Jiayu Xiong +4

  40. eess.IV 2026-05-21 reviewed
    Synthetic MRIs raise accuracy for one tumour classifier by 1.02%

    Do Synthetic Brain MRIs Reliably Improve Tumour Classification? A StyleGAN2-ADA Class-Plane Augmentation Study on BRISC 2025

    José Rafael Noriega Cedeño

  41. cs.CV 2026-05-21 reviewed
    Velocity mismatches flag anomalies in flow matching models

    Flow Mismatching: Unsupervised Anomaly Detection via Velocity Discrepancies in Flow Matching Models

    Shengzhe Chen +3

  42. cs.CV 2026-05-21 reviewed
    RoboSurg-VQA benchmarks segmentation-aware surgical visual question answering

    RoboSurg-VQA: A Multimodal Benchmark for Surgical Segmentation-Aware Visual Question Answering

    Chengyi Zhang +2

  43. cs.CV 2026-05-21 reviewed
    Dithering defends vision models against adversarial attacks

    Dithering Defense: Adversarial Robustness of Vision Foundation Models via Multi-Level Floyd-Steinberg Dithering

    Yury Belousov +3

  44. cs.CV 2026-05-21 reviewed
    Vertex weights let mmWave data drive accurate SMPL body fits

    Millimeter-wave Imaging for Anthropometric Body Measurement

    Miriam Senne +4

  45. cs.CV 2026-05-21 reviewed
    Motion data alone rivals video models trained on 10000x more examples

    The TIME Machine: On The Power of Motion for Efficient Perception

    Mantas Skackauskas +2

  46. cs.LG 2026-05-21 reviewed
    RADAR forecasts transfer by comparing representation trajectories

    RADAR: Relative Angular Divergence Across Representations

    Xavier Cadet +2

  47. cs.CV 2026-05-21 reviewed
    Reconstructed maps raise 3D detection scores without manual HD maps

    Scene Reconstruction as Mapping Priors for 3D Detection

    Yang Fu +10

  48. cs.CV 2026-05-21 reviewed
    Binary masks control precise motion in generated videos

    CoMoGen: COntrollable MOtion Dynamics and Interactions with Mask-Guided Video GENeration

    Adil Meric +5

  49. cs.CV 2026-05-21 reviewed
    Toolkit automates annotation of child-caregiver eye-tracking videos

    GazeBehavior Annotation Toolkit (GBAT): AI-powered toolkit for automatic annotation of egocentric eye-tracking and video data of child-caregiver interaction

    Iba Baig +7

  50. cs.CV 2026-05-21 reviewed
    Pixel prior from QueryMLP lifts buoy association to 0.7386

    Improved Vision-to-Chart Buoy Association with Learned World-to-Image Projection

    Borja Carrillo-Perez (Arquimea Research Center)