pith.

archive

Every paper Pith has read. Search by title, abstract, or pith.

9568 papers in cs.CV · page 11

  1. cs.CV 2026-05-19 reviewed
    Real images align diffusion models as well as preference pairs

    When Preference Labels Fall Short: Aligning Diffusion Models from Real Data

    Weiyan Chen +7

  2. cs.CV 2026-05-19 reviewed
    Dual-stream network lifts weather detection at full speed

    CADENet: Condition-Adaptive Asynchronous Dual-Stream Enhancement Network for Adverse Weather Perception in Autonomous Driving

    Sherif Khairy +1

  3. cs.AI 2026-05-19 reviewed
    Temporal conditioning changes AV planner style but not scores

    From Prompts to Pavement Through Time: Temporal Grounding in Agentic Scene-to-Plan Reasoning

    Ahmed Y. Gado +4

  4. cs.CV 2026-05-19 reviewed
    Landmark and language priors raise FER accuracy on three wild datasets

    LaCoVL-FER: Landmark-Guided Contrastive Learning Network with Vision-Language Enhancement for Facial Expression Recognition

    Jiaxin Wang +4

  5. cs.CV 2026-05-19 reviewed
    Stitched model lifts rewards to noisy latents for faster alignment

    Stitched Value Model for Diffusion Alignment

    Hyojun Go +10

  6. cs.CV 2026-05-19 reviewed
    Semi-supervised method reaches 79.99% Dice in fetal heart ultrasound

    Synergistic Foundation Models for Semi-Supervised Fetal Cardiac Ultrasound Analysis: SAM-Med2D Boundary Refinement and DINOv3 Semantic Enhancement

    Tonghao Zhuang +7

  7. cs.CV 2026-05-19 reviewed
    Pose accuracy proxies depth quality without needing ground-truth depth

    Depth2Pose: A Pose-Based Benchmark for Monocular Depth Estimation without Ground-Truth Depth

    Viktor Kocur +6

  8. cs.CV 2026-05-19 reviewed
    VLMs localize objects with boundary tokens

    Mechanisms of Object Localization in Vision-Language Models

    Timothy Schaumlöffel +2

  9. cs.LG 2026-05-19 reviewed
    Class prototypes on the hypersphere reach neural collapse by design

    Neural Collapse by Design: Learning Class Prototypes on the Hypersphere

    Panagiotis Koromilas +3

  10. cs.LG 2026-05-19 reviewed
    Prototypes on the hypersphere reach neural collapse by design

    Neural Collapse by Design: Learning Class Prototypes on the Hypersphere

    Panagiotis Koromilas +3

  11. cs.CV 2026-05-19 reviewed
    Attention chains cut 4D mesh generation to 9 seconds

    Fast 4D Mesh Generation by Spatio-Temporal Attention Chains

    Dvir Samuel +3

  12. cs.CV 2026-05-19 reviewed
    Fused expert preferences and ratings lift VLM aesthetic SRCC to 0.709

    Preferences Order, Ratings Anchor: From Fused Expert Aesthetic Ground Truth to Self-Distillation

    Yuanpei Zhao +7

  13. cs.CV 2026-05-19 reviewed
    Self-distillation raises VLM aesthetic SRCC from 0.504 to 0.709

    Preferences Order, Ratings Anchor: From Fused Expert Aesthetic Ground Truth to Self-Distillation

    Yuanpei Zhao +7

  14. cs.RO 2026-05-19 reviewed
    Training on near-failure paths improves driving safety

    Beyond Imitation: Learning Safe End-to-End Autonomous Driving from Hard Negatives

    Junli Wang +9

  15. cs.CV 2026-05-19 reviewed
    Neuron selection lets VAR models add user concepts without forgetting prior ones

    CPC-VAR: Continual Personalized and Compositional Generation in Visual Autoregressive Models

    Junhao Li +6

  16. cs.CV 2026-05-19 reviewed
    One reference image flags traffic anomalies via embedding matches

    Real-World On-Vehicle Evaluation of Embedding-Based Anomaly Detection

    Albert Schotschneider +4

  17. cs.CV 2026-05-19 reviewed
    Reward optimization erases unwanted concepts in flow models

    FlowErase-RL: Rethinking Concept Erasure as Reward Optimization in Flow Matching Models

    Yi Sun +7

  18. cs.GR 2026-05-19 reviewed
    Browser renders MRI digital twins at 82 FPS on low-cost GPUs

    Decentralized Direct Volume Rendering: A Browser-Native GPU Architecture for MRI Digital Twins in Resource-Constrained Settings

    Oserebameh Augustine Beckley

  19. cs.CV 2026-05-19 reviewed
    Geometry injection enables unaligned optical-SAR retrieval

    GeoMamba: A Geometry-driven MambaVision Framework and Dataset for Fine-grained Optical-SAR Object Retrieval

    Tiantong Fang +5

  20. cs.CV 2026-05-19 reviewed
    Staged distillation keeps tiny diffusion models stable at 1.6 percent teacher size

    LIFT and PLACE: A Simple, Stable, and Effective Knowledge Distillation Framework for Lightweight Diffusion Models

    Hyunsoo Han +2

  21. cs.CV 2026-05-19 reviewed
    Tiny diffusion models reach FID 15.73 with staged distillation

    LIFT and PLACE: A Simple, Stable, and Effective Knowledge Distillation Framework for Lightweight Diffusion Models

    Hyunsoo Han +2

  22. cs.CV 2026-05-19 reviewed
    Frozen probe tunes video models to follow drone inertial commands

    Aero-World: Action-Conditioned Aerial Video Generation from Inertial Controls

    Abdul Mohaimen Al Radi +4

  23. cs.CV 2026-05-19 reviewed
    Tango3D aligns pixels to 3D points while preserving global retrieval

    Tango3D: Towards Alignment for Global and Local 2D-3D Correspondence

    Zebin He +6

  24. cs.CV 2026-05-19 reviewed
    Downsampled block selection speeds up diffusion attention nearly 7x

    Efficient Long-Context Modeling in Diffusion Language Models via Block Approximate Sparse Attention

    Wenhu Zhang +8

  25. cs.CV 2026-05-19 reviewed
    Physics-in-the-loop agents produce more complex valid CAD designs

    Physics-in-the-Loop: A Hybrid Agentic Architecture for Validated CAD Engineering Design

    Elias Berger +4

  26. cs.CV 2026-05-19 reviewed
    Sonar simulator matches real images at texture KL below 0.07

    Physics-informed simulation framework for realistic sonar image generation and statistical validation

    Kamal Basha S +1

  27. cs.CV 2026-05-19 reviewed
    CRP groups medical tasks from text for 73% Dice with 4% forgetting

    MedCRP-CL: Continual Medical Image Segmentation via Bayesian Nonparametric Semantic Modality Discovery

    Ziyuan Gao

  28. cs.CV 2026-05-19 reviewed
    New dataset labels 10k white blood cell images with 11 morphological traits

    WBCAtt+: Fine-Grained Pixel-Level Morphological Annotations for White Blood Cell Images

    Satoshi Tsutsui +3

  29. cs.CV 2026-05-19 reviewed
    Real JPEG tables cut false positives in document forgery detection

    DocQT: Improving Document Forgery Localization Robustness via Diverse JPEG Quantization Tables

    Kylian Ronfleux-Corail (L3I) +3

  30. eess.IV 2026-05-19 reviewed
    TADA adapts steganalysis to unknown JPEG pipelines

    Tackle CSM in JPEG Steganalysis with Data Adaptation

    Rony Abecidan (CRIStAL) +5

  31. cs.CV 2026-05-19 reviewed
    Satellite and ground photos fuse for better outdoor view synthesis

    Cross-View Splatter: Feed-Forward View Synthesis with Georeferenced Images

    Matias Turkulainen +8

  32. cs.CV 2026-05-19 reviewed
    NeRF augmentations train pose estimators from 25 real images

    CAD-Free Learning of Spacecraft Pose Estimators via NeRF-Based Augmentations

    Antoine Legrand +2

  33. cs.CV 2026-05-19 reviewed
    NeRF lets pose estimators train on 25-400 real images

    CAD-Free Learning of Spacecraft Pose Estimators via NeRF-Based Augmentations

    Antoine Legrand +2

  34. cs.CV 2026-05-19 reviewed
    Refiner teaches image models to fix their own mistakes

    Benchmarking and Evolving Reason-Reflect-Rectify for Reflective Visual Generation

    Junjie Wang +10

  35. cs.CV 2026-05-19 reviewed
    Panorama-first split lifts zero-shot navigation success 59 percent

    P2DNav: Panorama-to-Downview Reasoning for Zero-shot Vision-and-Language Navigation

    Kai Sheng +7

  36. cs.RO 2026-05-19 reviewed
    One model drives well across cities and sensors without retraining

    HEAT: Heterogeneous End-to-End Autonomous Driving via Trajectory-Guided World Models

    Hoonhee Cho +5

  37. cs.CV 2026-05-19 reviewed
    Component style transfer closes satellite sim-to-real gap

    Component-Aware Structure-Preserving Style Transfer for Satellite Visual Sim2Real Data Construction

    Zongwu Xie +4

  38. cs.CV 2026-05-19 reviewed
    Part-wise style transfer raises satellite pose accuracy

    Component-Aware Structure-Preserving Style Transfer for Satellite Visual Sim2Real Data Construction

    Zongwu Xie +4

  39. cs.CV 2026-05-19 reviewed
    Few-shot visual prototypes correct misclassifications in text-prompted segmentation

    PrAda: Few-Shot Visual Adaptation for Text-Prompted Segmentation

    Gabriele Rosi +3

  40. cs.CV 2026-05-19 reviewed
    Contrastive registers let ViTs drop spurious tokens and lift segmentation accuracy

    UniRefiner: Teaching Pre-trained ViTs to Self-Dispose Dross via Contrastive Register

    Congpei Qiu +5

  41. cs.CV 2026-05-19 reviewed
    Bézier curves stabilize LiDAR human motion capture

    Bézier Degradation Modeling for LiDAR-based Human Motion Capture

    Xiaoqi An +4

  42. cs.CV 2026-05-19 reviewed
    VLM feedback iterates to fix cross-camera color constancy

    White-Balance First, Adjust Later: Cross-Camera Color Constancy via Vision-Language Evaluation

    Shuwei Li +2

  43. cs.CV 2026-05-19 reviewed
    Physics-guided diffusion designs metasurface absorbers in 30 seconds

    Physics Guided Conditional Diffusion Framework for Generative Inverse Design of Manufacturable Metasurface based Absorbers

    Vineetha Joy +5

  44. cs.CV 2026-05-19 reviewed
    SVD-ordered paths yield less noisy model attributions

    Spectral Integrated Gradients for Coarse-to-Fine Feature Attribution

    Soyeon Kim +3

  45. cs.CV 2026-05-19 reviewed
    Datasets enable global tree mortality mapping from aerial imagery

    deadtrees.earth-aerial: A Multi-Resolution Aerial Image Dataset for Tree Cover and Mortality Detection

    Ayushi Sharma +11

  46. cs.CV 2026-05-19 reviewed
    YOLO26-MoE hits 0.99 mAP for spotting insulator faults in drone photos

    A novel YOLO26-MoE optimized by an LLM agent for insulator fault detection considering UAV images

    João Pedro Matos-Carvalho +4

  47. cs.CV 2026-05-19 reviewed
    Laminating film on lenses blocks identity while keeping action cues

    Lens Privacy Sealing: A New Benchmark and Method for Physical Privacy-Preserving Action Recognition

    Mengyuan Liu +3

  48. cs.CV 2026-05-19 reviewed
    Laminating film on lenses hides identities for action recognition

    Lens Privacy Sealing: A New Benchmark and Method for Physical Privacy-Preserving Action Recognition

    Mengyuan Liu +3

  49. cs.CV 2026-05-19 reviewed
    MLLMs often back correct answers with inconsistent egocentric evidence

    EgoCoT-Bench: Benchmarking Grounded and Verifiable Operation-Centric Chain of Thought Reasoning for MLLMs

    Yang Dai +3

  50. cs.CV 2026-05-19 reviewed
    Sparse diffusion cuts redundant matches for steadier camera tracking

    EpiDiffVO: Geometry-Aware Epipolar Diffusion for Robust Visual Odometry

    Prateeth Rao