Mixed citations

Title resolution pending

Citation behavior is mixed. The most common role is "unclear" (30% of classified citations).

20 Pith papers cite this work.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

citation-role summary

background 3 · method 3 · baseline 2 · other 2

citation-polarity summary

unclear 3 · use method 3 · background 2 · baseline 2

representative citing papers

Beyond Detection: A Structure-Aware Framework for Scene Text Tracking

cs.CV · 2026-05-17 · unverdicted · novelty 7.0

SymTrack is the first systematic detection-free framework for scene text tracking that constructs benchmarks from video text spotting datasets and reports up to 11.97% AUC gains over prior trackers.

HairGPT: Strand-as-Language Autoregressive Modeling for Realistic 3D Hairstyle Synthesis

cs.GR · 2026-05-09 · unverdicted · novelty 7.0

HairGPT reframes 3D hairstyle synthesis as dual-decoupled autoregressive strand sequence modeling with geometric tokenization for semantic control and rare style generation.

Beyond Bag-of-Patches: Learning Global Layout via Textual Supervision for Late-Interaction Visual Document Retrieval

cs.CV · 2026-05-08 · unverdicted · novelty 7.0

A text-supervised global layout embedding augments local patch representations in late-interaction VDR, yielding +2.4 nDCG@5 and +2.3 MAP@5 gains over ColPali/ColQwen baselines on ViDoRe-v2.

Privatar: Scalable Privacy-preserving Multi-user VR via Secure Offloading

cs.CR · 2026-04-19 · unverdicted · novelty 7.0

Privatar uses horizontal frequency partitioning and distribution-aware minimal perturbation to enable private offloading of VR avatar reconstruction, supporting 2.37x more users with modest overhead.

Q-Align: Teaching LMMs for Visual Scoring via Discrete Text-Defined Levels

cs.CV · 2023-12-28 · conditional · novelty 7.0

Q-Align trains LMMs on discrete text-defined levels for visual scoring, achieving SOTA on IQA, IAA, and VQA while unifying the tasks in OneAlign.

Low Latency Gaze Tracking via Latent Optical Sensing

cs.CV · 2026-05-18 · unverdicted · novelty 6.0

A hardware prototype performs gaze estimation by optically encoding task-relevant features with a microlens array and mask, captured on a 4x4 phototransistor array and decoded by a small neural network, reaching 3.4 ms latency with competitive accuracy.

Prognostic Value of Lung Ultrasound Biomarkers for Readmission Risk in Congestive Heart Failure: A Pilot Data-Driven Analysis

eess.SP · 2026-05-16 · unverdicted · novelty 6.0

Pilot study uses pretrained video encoder features from lung ultrasound to predict 30-day CHF readmission, finding lower-lung views and temporal differences most informative with top MLP F1 of 0.80.

DocAtlas: Multilingual Document Understanding Across 80+ Languages

cs.CL · 2026-05-12 · unverdicted · novelty 6.0

DocAtlas introduces model-free rendering pipelines to create DocTag-annotated datasets across 82 languages and shows DPO adaptation improves multilingual performance without base-language degradation.

Video Prediction Policy: A Generalist Robot Policy with Predictive Visual Representations

cs.CV · 2024-12-19 · unverdicted · novelty 6.0

Video Prediction Policy conditions robot action learning on future-frame predictions inside fine-tuned video diffusion models, yielding 18.6% relative gains on Calvin ABC-D and 31.6% higher real-world success rates.

Scaling Rectified Flow Transformers for High-Resolution Image Synthesis

cs.CV · 2024-03-05 · conditional · novelty 6.0

Biased noise sampling for rectified flows combined with a bidirectional text-image transformer architecture yields state-of-the-art high-resolution text-to-image results that scale predictably with model size.

Beyond Instance-Level Self-Supervision in 3D Multi-Modal Medical Imaging

cs.CV · 2026-05-14 · unverdicted · novelty 5.0

A self-supervised approach uses consistent spatial relationships of anatomical structures across patients to improve 3D multi-modal medical image representations, yielding modest gains on segmentation and classification tasks.

UAV-Assisted Scan-to-Simulation for Landslides Using Physics-Informed Gaussian Splatting

cs.CV · 2026-05-11 · unverdicted · novelty 5.0

A UAV-to-3DGS-to-MPM pipeline reconstructs real landslide sites with photorealistic visuals and runs physics-based simulations, validated on a Hong Kong event.

Intrinsic Gradient Suppression for Label-Noise Prompt Tuning in Vision-Language Models

cs.CV · 2026-05-01 · unverdicted · novelty 5.0

Double-Softmax Prompt Tuning uses sequential softmax normalization to create self-adaptive gradient saturation that filters noisy samples while preserving useful updates in CLIP prompt tuning.

Group Cognition Learning: Making Everything Better Through Governed Two-Stage Agents Collaboration

cs.LG · 2026-05-01 · unverdicted · novelty 4.0 · 2 refs

Group Cognition Learning uses governed two-stage agents after separate modality encoding to mitigate dominance and spurious coupling, reporting state-of-the-art results on CMU-MOSI, CMU-MOSEI, and MIntRec for regression and classification.

A Robust Semantic Segmentation Pipeline for the CVPR 2026 8th UG2+ Challenge Track 2

cs.CV · 2026-05-21 · unverdicted · novelty 2.0 · 2 refs

A semi-supervised pipeline applies UniMatch V2 to the WeatherProof dataset by treating degraded images as unlabeled data plus test-time augmentation for semantic segmentation in adverse weather.

Low-Cost Neural Radiance Fields

cs.CV · 2026-05-10 · unverdicted · novelty 2.0

Comparative study of DS-NeRF, TensoRF, and HashNeRF with depth-supervision and architectural variants finds no conclusive outperformance under equal training time but identifies which design choices transfer to low-data, low-compute regimes.

Bad Seeing or Bad Thinking? Rewarding Perception for Multimodal Reasoning

cs.AI · 2026-05-13

LIVEditor-14B: Lightning Unified Video Editing via In-Context Sparse Attention

cs.CV · 2026-05-06

Active Sampling for Ultra-Low-Bit-Rate Video Compression via Conditional Controlled Diffusion

cs.CV · 2026-05-04

Dual-Anchoring: Addressing State Drift in Vision-Language Navigation

cs.CV · 2026-04-19

citing papers explorer

Showing 20 of 20 citing papers.

Beyond Detection: A Structure-Aware Framework for Scene Text Tracking
cs.CV · 2026-05-17 · unverdicted · none · ref 14

HairGPT: Strand-as-Language Autoregressive Modeling for Realistic 3D Hairstyle Synthesis
cs.GR · 2026-05-09 · unverdicted · none · ref 1

Beyond Bag-of-Patches: Learning Global Layout via Textual Supervision for Late-Interaction Visual Document Retrieval
cs.CV · 2026-05-08 · unverdicted · none · ref 1

Privatar: Scalable Privacy-preserving Multi-user VR via Secure Offloading
cs.CR · 2026-04-19 · unverdicted · none · ref 244

Q-Align: Teaching LMMs for Visual Scoring via Discrete Text-Defined Levels
cs.CV · 2023-12-28 · conditional · none · ref 1

Low Latency Gaze Tracking via Latent Optical Sensing
cs.CV · 2026-05-18 · unverdicted · none · ref 1

Prognostic Value of Lung Ultrasound Biomarkers for Readmission Risk in Congestive Heart Failure: A Pilot Data-Driven Analysis
eess.SP · 2026-05-16 · unverdicted · none · ref 1

DocAtlas: Multilingual Document Understanding Across 80+ Languages
cs.CL · 2026-05-12 · unverdicted · none · ref 1

Video Prediction Policy: A Generalist Robot Policy with Predictive Visual Representations
cs.CV · 2024-12-19 · unverdicted · none · ref 1

Scaling Rectified Flow Transformers for High-Resolution Image Synthesis
cs.CV · 2024-03-05 · conditional · none · ref 1

Beyond Instance-Level Self-Supervision in 3D Multi-Modal Medical Imaging
cs.CV · 2026-05-14 · unverdicted · none · ref 1

UAV-Assisted Scan-to-Simulation for Landslides Using Physics-Informed Gaussian Splatting
cs.CV · 2026-05-11 · unverdicted · none · ref 1

Intrinsic Gradient Suppression for Label-Noise Prompt Tuning in Vision-Language Models
cs.CV · 2026-05-01 · unverdicted · none · ref 1

Group Cognition Learning: Making Everything Better Through Governed Two-Stage Agents Collaboration
cs.LG · 2026-05-01 · unverdicted · none · ref 1 · 2 links

A Robust Semantic Segmentation Pipeline for the CVPR 2026 8th UG2+ Challenge Track 2
cs.CV · 2026-05-21 · unverdicted · none · ref 1 · 2 links

Low-Cost Neural Radiance Fields
cs.CV · 2026-05-10 · unverdicted · none · ref 1

Bad Seeing or Bad Thinking? Rewarding Perception for Multimodal Reasoning
cs.AI · 2026-05-13 · unreviewed · ref 1

LIVEditor-14B: Lightning Unified Video Editing via In-Context Sparse Attention
cs.CV · 2026-05-06 · unreviewed · ref 1

Active Sampling for Ultra-Low-Bit-Rate Video Compression via Conditional Controlled Diffusion
cs.CV · 2026-05-04 · unreviewed · ref 1

Dual-Anchoring: Addressing State Drift in Vision-Language Navigation
cs.CV · 2026-04-19 · unreviewed · ref 116
