super hub Canonical reference

ImageBind: One Embedding Space to Bind Them All

doi: 10 · 2023 · arXiv 2729.2023

Canonical reference. 76% of citing Pith papers cite this work as background.

119 Pith papers citing it

Background 76% of classified citations

citation-role summary

background 36 · baseline 7 · method 4 · dataset 2

citation-polarity summary

background 37 · baseline 7 · use method 4 · use dataset 1

representative citing papers

RS2AD-LiDAR: End-to-End Autonomous Driving LiDAR Data Generation from Roadside Sensor Observations

cs.CV · 2026-05-22 · unverdicted · novelty 7.0

RS2AD-LiDAR reconstructs vehicle LiDAR data from roadside observations via coordinate transformation, virtual LiDAR modeling and resampling, claimed as the first such method, with experiments showing improved object detection when mixed with real data.

3D LULC classification using multispectral LiDAR and deep learning: current and prospective schemes

cs.CV · 2026-05-21 · conditional · novelty 7.0

Introduces NMCA-aligned L1/L2 LULC schemes and the Loosdorf-MSL benchmark dataset, with Point Transformer V3 reaching 79.4% mIoU on 8 classes and 58.9% on 20 classes, plus gains from multispectral inputs.

AgroVG: A Large-Scale Multi-Source Benchmark for Agricultural Visual Grounding

cs.CV · 2026-05-21 · accept · novelty 7.0

AgroVG is a new multi-source benchmark for agricultural visual grounding formulated as generalized set prediction, with protocols for box and mask grounding across single-target, multi-target, and target-absent queries from six object families.

Oracle Supervision Transfers for Hyperparameter Prediction in Model-Based Image Denoising

cs.CV · 2026-05-19 · conditional · novelty 7.0

HyperDn is a configuration-conditioned predictor that transfers oracle supervision across denoising paradigms to achieve near-oracle hyperparameter prediction with few or zero target labels.

Preferences Order, Ratings Anchor: From Fused Expert Aesthetic Ground Truth to Self-Distillation

cs.CV · 2026-05-19 · conditional · novelty 7.0 · 4 refs

PPaint fuses expert pairwise preferences and ratings into ground truth; PSDistill converts VLM pairwise judgments into calibrated pseudo-scores via Elo and trains the same VLM to produce a single-pass aesthetic scorer that improves SRCC across categories.

LMM-Track4D: Eliciting 4D Dynamic Reasoning in LMMs via Trajectory-Grounded Dialogue

cs.CV · 2026-05-19 · unverdicted · novelty 7.0

LMM-Track4D formulates a trajectory-grounded dialogue task, releases Track4D-Bench with 526 samples, and proposes RTGE encoding, TRK state token, and OSK-RA decoder to elicit better 4D spatiotemporal reasoning in LMMs.

PluRule: A Benchmark for Moderating Pluralistic Communities on Social Media

cs.CL · 2026-05-16 · unverdicted · novelty 7.0 · 4 refs

PluRule is a new multimodal multilingual benchmark showing that state-of-the-art vision-language models perform only marginally better than a trivial baseline at detecting specific rule violations in pluralistic online communities.

Vector Scaffolding: Inter-Scale Orchestration for Differentiable Image Vectorization

cs.CV · 2026-05-12 · unverdicted · novelty 7.0

Vector Scaffolding uses Interior Gradient Aggregation, Progressive Stratification, and Rapid Inflation Scheduling to achieve 2.5x faster optimization and up to 1.4 dB higher PSNR in differentiable vectorization.

Martingale-Consistent Self-Supervised Learning

cs.LG · 2026-05-12 · unverdicted · novelty 7.0

The paper develops a martingale-consistent SSL framework enforcing expected coherence between coarse and refined predictions via new objectives and a Monte Carlo estimator, improving robustness under partial observations.

Geometrically Approximated Modeling for Emitter-Centric Ray-Triangle Filtering in Arbitrarily Dynamic LiDAR Simulation

cs.GR · 2026-05-11 · unverdicted · novelty 7.0

GRCA uses emitter-centric geometric culling of rays per triangle to accelerate LiDAR simulation in arbitrarily dynamic scenes, reporting up to 14.55x speedup over Embree and 7.97x over OptiX.

AnomalyClaw: A Universal Visual Anomaly Detection Agent via Tool-Grounded Refutation

cs.CV · 2026-05-11 · conditional · novelty 7.0

AnomalyClaw turns single-step VLM anomaly judgments into a multi-round tool-grounded refutation process, delivering consistent macro-AUROC gains of 3.5-7.9 percentage points over direct inference across 12 cross-domain datasets.

Field-Localized Forgery Detection for Digital Identity Documents

cs.CV · 2026-05-09 · unverdicted · novelty 7.0 · 2 refs

FLiD is a field-localized forgery detection method for identity documents that outperforms full-document baselines and general detectors with significantly fewer parameters.

ProtoSSL: Interpretable Prototype Learning from Unlabeled Time-Series Data

cs.LG · 2026-05-07 · unverdicted · novelty 7.0

ProtoSSL discovers generalizable prototypes from unlabeled time-series via self-supervision and assigns them to new tasks for interpretable predictions, outperforming supervised baselines in low-data regimes on ECG datasets.

Does it Really Count? Assessing Semantic Grounding in Text-Guided Class-Agnostic Counting

cs.CV · 2026-05-04 · unverdicted · novelty 7.0

Text-guided class-agnostic counting models exhibit significant weaknesses in grounding textual prompts to visual objects, as demonstrated by new negative-label and distractor tests on a multi-category dataset.

ESARBench: A Benchmark for Agentic UAV Embodied Search and Rescue

cs.RO · 2026-05-02 · unverdicted · novelty 7.0

ESARBench is the first unified benchmark for MLLM-driven UAV agents that must explore, locate clues, and decide on victim positions in photorealistic simulated SAR environments.

LoopCTR: Unlocking the Loop Scaling Power for Click-Through Rate Prediction

cs.IR · 2026-04-21 · unverdicted · novelty 7.0

LoopCTR trains CTR models with recursive layer reuse and process supervision so that zero-loop inference outperforms baselines on public and industrial datasets.

Divide-and-Conquer Approach to Holistic Cognition in High-Similarity Contexts with Limited Data

cs.CV · 2026-04-21 · unverdicted · novelty 7.0 · 2 refs

DHCNet improves ultra-fine-grained visual categorization by progressively building holistic cognition from local discrepancies using self-shuffling and refinement on limited data.

BasketHAR: A Multimodal Dataset for Human Activity Recognition and Sport Analysis in Basketball Training Scenarios

cs.CV · 2026-04-18 · conditional · novelty 7.0

BasketHAR is a publicly released multimodal dataset of professional basketball training activities captured with inertial sensors, physiological signals, and video, accompanied by a baseline alignment method.

Efficient Video Diffusion Models: Advancements and Challenges

cs.CV · 2026-04-17 · unverdicted · novelty 7.0

A survey that groups efficient video diffusion methods into four paradigms—step distillation, efficient attention, model compression, and cache/trajectory optimization—and outlines open challenges for practical use.

DiV-INR: Extreme Low-Bitrate Diffusion Video Compression with INR Conditioning

eess.IV · 2026-04-09 · unverdicted · novelty 7.0

DiV-INR integrates implicit neural representations as conditioning signals for diffusion models to achieve better perceptual quality than HEVC, VVC, and prior neural codecs at extremely low bitrates under 0.05 bpp.

On the Decompositionality of Neural Networks

cs.LO · 2026-04-09 · unverdicted · novelty 7.0

Neural decompositionality is defined via decision-boundary semantic preservation, and language transformers largely satisfy it under SAVED while vision models often do not.

Revealing Physical-World Semantic Vulnerabilities: Universal Adversarial Patches for Infrared Vision-Language Models

cs.CV · 2026-04-03 · unverdicted · novelty 7.0

UCGP is a universal physical adversarial patch that compromises cross-modal semantic alignment in IR-VLMs through curved-grid parameterization and representation-space disruption.

LumaFlux: Lifting 8-Bit Worlds to HDR Reality with Physically-Guided Diffusion Transformers

cs.CV · 2026-04-03 · unverdicted · novelty 7.0

LumaFlux is a physically and perceptually guided diffusion transformer for SDR-to-HDR conversion that introduces PGA, PCM, and HDR Residual Coupler modules plus a new training corpus and benchmark, outperforming prior ITM methods.

CORP: Closed-Form One-shot Representation-Preserving Structured Pruning for Transformers

cs.LG · 2026-02-05 · unverdicted · novelty 7.0

CORP performs one-shot structured pruning of Transformers by modeling removed components as affine functions of retained ones and solving closed-form ridge regressions on calibration data to fold compensation into weights, retaining 83.27% Top-1 accuracy on DeiT-Huge after 50% pruning.

citing papers explorer

Showing 50 of 119 citing papers.

RS2AD-LiDAR: End-to-End Autonomous Driving LiDAR Data Generation from Roadside Sensor Observations cs.CV · 2026-05-22 · unverdicted · none · ref 17
RS2AD-LiDAR reconstructs vehicle LiDAR data from roadside observations via coordinate transformation, virtual LiDAR modeling and resampling, claimed as the first such method, with experiments showing improved object detection when mixed with real data.
3D LULC classification using multispectral LiDAR and deep learning: current and prospective schemes cs.CV · 2026-05-21 · conditional · none · ref 31
Introduces NMCA-aligned L1/L2 LULC schemes and the Loosdorf-MSL benchmark dataset, with Point Transformer V3 reaching 79.4% mIoU on 8 classes and 58.9% on 20 classes, plus gains from multispectral inputs.
AgroVG: A Large-Scale Multi-Source Benchmark for Agricultural Visual Grounding cs.CV · 2026-05-21 · accept · none · ref 7
AgroVG is a new multi-source benchmark for agricultural visual grounding formulated as generalized set prediction, with protocols for box and mask grounding across single-target, multi-target, and target-absent queries from six object families.
Oracle Supervision Transfers for Hyperparameter Prediction in Model-Based Image Denoising cs.CV · 2026-05-19 · conditional · none · ref 22
HyperDn is a configuration-conditioned predictor that transfers oracle supervision across denoising paradigms to achieve near-oracle hyperparameter prediction with few or zero target labels.
Preferences Order, Ratings Anchor: From Fused Expert Aesthetic Ground Truth to Self-Distillation cs.CV · 2026-05-19 · conditional · none · ref 10 · 4 links
PPaint fuses expert pairwise preferences and ratings into ground truth; PSDistill converts VLM pairwise judgments into calibrated pseudo-scores via Elo and trains the same VLM to produce a single-pass aesthetic scorer that improves SRCC across categories.
LMM-Track4D: Eliciting 4D Dynamic Reasoning in LMMs via Trajectory-Grounded Dialogue cs.CV · 2026-05-19 · unverdicted · none · ref 26
LMM-Track4D formulates a trajectory-grounded dialogue task, releases Track4D-Bench with 526 samples, and proposes RTGE encoding, TRK state token, and OSK-RA decoder to elicit better 4D spatiotemporal reasoning in LMMs.
PluRule: A Benchmark for Moderating Pluralistic Communities on Social Media cs.CL · 2026-05-16 · unverdicted · none · ref 102 · 4 links
PluRule is a new multimodal multilingual benchmark showing that state-of-the-art vision-language models perform only marginally better than a trivial baseline at detecting specific rule violations in pluralistic online communities.
Vector Scaffolding: Inter-Scale Orchestration for Differentiable Image Vectorization cs.CV · 2026-05-12 · unverdicted · none · ref 11
Vector Scaffolding uses Interior Gradient Aggregation, Progressive Stratification, and Rapid Inflation Scheduling to achieve 2.5x faster optimization and up to 1.4 dB higher PSNR in differentiable vectorization.
Martingale-Consistent Self-Supervised Learning cs.LG · 2026-05-12 · unverdicted · none · ref 2
The paper develops a martingale-consistent SSL framework enforcing expected coherence between coarse and refined predictions via new objectives and a Monte Carlo estimator, improving robustness under partial observations.
Geometrically Approximated Modeling for Emitter-Centric Ray-Triangle Filtering in Arbitrarily Dynamic LiDAR Simulation cs.GR · 2026-05-11 · unverdicted · none · ref 47
GRCA uses emitter-centric geometric culling of rays per triangle to accelerate LiDAR simulation in arbitrarily dynamic scenes, reporting up to 14.55x speedup over Embree and 7.97x over OptiX.
AnomalyClaw: A Universal Visual Anomaly Detection Agent via Tool-Grounded Refutation cs.CV · 2026-05-11 · conditional · none · ref 8
AnomalyClaw turns single-step VLM anomaly judgments into a multi-round tool-grounded refutation process, delivering consistent macro-AUROC gains of 3.5-7.9 percentage points over direct inference across 12 cross-domain datasets.
Field-Localized Forgery Detection for Digital Identity Documents cs.CV · 2026-05-09 · unverdicted · none · ref 10 · 2 links
FLiD is a field-localized forgery detection method for identity documents that outperforms full-document baselines and general detectors with significantly fewer parameters.
ProtoSSL: Interpretable Prototype Learning from Unlabeled Time-Series Data cs.LG · 2026-05-07 · unverdicted · none · ref 30
ProtoSSL discovers generalizable prototypes from unlabeled time-series via self-supervision and assigns them to new tasks for interpretable predictions, outperforming supervised baselines in low-data regimes on ECG datasets.
Does it Really Count? Assessing Semantic Grounding in Text-Guided Class-Agnostic Counting cs.CV · 2026-05-04 · unverdicted · none · ref 54
Text-guided class-agnostic counting models exhibit significant weaknesses in grounding textual prompts to visual objects, as demonstrated by new negative-label and distractor tests on a multi-category dataset.
ESARBench: A Benchmark for Agentic UAV Embodied Search and Rescue cs.RO · 2026-05-02 · unverdicted · none · ref 42
ESARBench is the first unified benchmark for MLLM-driven UAV agents that must explore, locate clues, and decide on victim positions in photorealistic simulated SAR environments.
LoopCTR: Unlocking the Loop Scaling Power for Click-Through Rate Prediction cs.IR · 2026-04-21 · unverdicted · none · ref 27
LoopCTR trains CTR models with recursive layer reuse and process supervision so that zero-loop inference outperforms baselines on public and industrial datasets.
Divide-and-Conquer Approach to Holistic Cognition in High-Similarity Contexts with Limited Data cs.CV · 2026-04-21 · unverdicted · none · ref 6 · 2 links
DHCNet improves ultra-fine-grained visual categorization by progressively building holistic cognition from local discrepancies using self-shuffling and refinement on limited data.
BasketHAR: A Multimodal Dataset for Human Activity Recognition and Sport Analysis in Basketball Training Scenarios cs.CV · 2026-04-18 · conditional · none · ref 7
BasketHAR is a publicly released multimodal dataset of professional basketball training activities captured with inertial sensors, physiological signals, and video, accompanied by a baseline alignment method.
Efficient Video Diffusion Models: Advancements and Challenges cs.CV · 2026-04-17 · unverdicted · none · ref 186
A survey that groups efficient video diffusion methods into four paradigms—step distillation, efficient attention, model compression, and cache/trajectory optimization—and outlines open challenges for practical use.
DiV-INR: Extreme Low-Bitrate Diffusion Video Compression with INR Conditioning eess.IV · 2026-04-09 · unverdicted · none · ref 13
DiV-INR integrates implicit neural representations as conditioning signals for diffusion models to achieve better perceptual quality than HEVC, VVC, and prior neural codecs at extremely low bitrates under 0.05 bpp.
On the Decompositionality of Neural Networks cs.LO · 2026-04-09 · unverdicted · none · ref 14
Neural decompositionality is defined via decision-boundary semantic preservation, and language transformers largely satisfy it under SAVED while vision models often do not.
Revealing Physical-World Semantic Vulnerabilities: Universal Adversarial Patches for Infrared Vision-Language Models cs.CV · 2026-04-03 · unverdicted · none · ref 41
UCGP is a universal physical adversarial patch that compromises cross-modal semantic alignment in IR-VLMs through curved-grid parameterization and representation-space disruption.
LumaFlux: Lifting 8-Bit Worlds to HDR Reality with Physically-Guided Diffusion Transformers cs.CV · 2026-04-03 · unverdicted · none · ref 3
LumaFlux is a physically and perceptually guided diffusion transformer for SDR-to-HDR conversion that introduces PGA, PCM, and HDR Residual Coupler modules plus a new training corpus and benchmark, outperforming prior ITM methods.
CORP: Closed-Form One-shot Representation-Preserving Structured Pruning for Transformers cs.LG · 2026-02-05 · unverdicted · none · ref 8
CORP performs one-shot structured pruning of Transformers by modeling removed components as affine functions of retained ones and solving closed-form ridge regressions on calibration data to fold compensation into weights, retaining 83.27% Top-1 accuracy on DeiT-Huge after 50% pruning.
Learning to Build Shapes by Extrusion cs.GR · 2026-01-30 · unverdicted · none · ref 18
Text Encoded Extrusions (TEE) lets LLMs generate and edit manifold 3D meshes by learning sequences of face extrusions from decomposed quadrilateral meshes.
FaSTA$^*$: Fast-Slow Toolpath Agent with Subroutine Mining for Efficient Multi-turn Image Editing cs.CV · 2025-06-26 · unverdicted · none · ref 12
FaSTA* combines LLM fast planning with A* search and inductive subroutine mining to create an efficient agent for multi-turn image editing tasks.
From Standalone LLMs to Integrated Intelligence: A Survey of Compound AI Systems cs.MA · 2025-06-05 · accept · none · ref 60
A survey that defines Compound AI Systems, proposes a multi-dimensional taxonomy based on component roles and orchestration strategies, reviews four foundational paradigms, and identifies key challenges for future research.
High Volume Rate 3D Ultrasound Reconstruction with Diffusion Models eess.IV · 2025-05-28 · unverdicted · none · ref 41 · 2 links
Diffusion models reconstruct high-resolution 3D cardiac ultrasound volumes from heavily undersampled elevation planes and outperform traditional interpolation and supervised deep learning baselines.
MirrorCheck: Efficient Adversarial Defense for Vision-Language Models cs.CV · 2024-06-13 · unverdicted · none · ref 20 · 2 links
MirrorCheck detects adversarial attacks on VLMs via T2I regeneration for semantic consistency checks, using stochastic model selection and one-time perturbations for robustness against adaptive attacks.
Hyper-V2X: Hypernetworks for Estimating Epistemic and Aleatoric Uncertainty in Cooperative Bird's-Eye-View Semantic Segmentation cs.CV · 2026-05-20 · unverdicted · none · ref 7
Hyper-V2X uses a Bayesian hypernetwork with partial weight generation and V2X context embedding to produce calibrated epistemic and aleatoric uncertainty estimates for multi-agent BEV segmentation on the OPV2V benchmark.
Automatic Discovery of Disease Subgroups by Contrasting with Healthy Controls cs.LG · 2026-05-20 · conditional · none · ref 32
Deep UCSL uses a contrastive EM loss on patient-control labels to isolate disease-driven subgroups in medical imaging by suppressing shared healthy variability.
Towards Understanding Self-Pretraining for Sequence Classification cs.LG · 2026-05-20 · unverdicted · none · ref 11
Self-pretraining improves Transformer sequence classification by enabling learning of proximity-biased attention from positional encodings that label supervision alone cannot easily acquire from random starts.
Mechanisms of Misgeneralization in Physical Sequence Modeling cs.LG · 2026-05-19 · unverdicted · none · ref 24
Generative sequence models for physical tasks exhibit physical misgeneralization, where local prediction errors propagate through physical measurements to distort aggregate distributions over quantities like distance or energy; a data deviation kernel explains and predicts the shifts and supports a kernel…
SegRAG: Training-Free Retrieval-Augmented Semantic Segmentation cs.CV · 2026-05-17 · unverdicted · none · ref 37 · 2 links
SegRAG is a training-free retrieval-augmented framework that extracts class-specific point prompts from a filtered DINOv3 feature bank to boost SAM3 semantic segmentation performance on standard and agricultural benchmarks.
MSIQ: Moment-based Scale-Invariant Quality Measure for Single Image Super-Resolution cs.CV · 2026-05-17 · unverdicted · none · ref 5
MSIQ is a scale-invariant, model-free quality metric for single image super-resolution using normalized central geometric moments for direct comparison of different-resolution images.
Prognostic Value of Lung Ultrasound Biomarkers for Readmission Risk in Congestive Heart Failure: A Pilot Data-Driven Analysis eess.SP · 2026-05-16 · unverdicted · none · ref 71
Pilot study uses pretrained video encoder features from lung ultrasound to predict 30-day CHF readmission, finding lower-lung views and temporal differences most informative with top MLP F1 of 0.80.
Decomposed Vision-Language Alignment for Fine-Grained Open-Vocabulary Segmentation cs.CV · 2026-05-15 · unverdicted · none · ref 18 · 3 links
Decomposed Vision-Language Alignment framework factorizes prompts into concept and attribute tokens with Feature-Gated Cross-Attention for better compositional generalization in fine-grained open-vocabulary segmentation.
On What We Can Learn from Low-Resolution Data cs.LG · 2026-05-12 · unverdicted · none · ref 22 · 2 links
Low-resolution data improves high-resolution model performance when high-resolution samples are limited, via KL-divergence bounds and experiments on vision transformers and CNNs.
EDGER: EDge-Guided with HEatmap Refinement for Generalizable Image Forgery Localization cs.CV · 2026-05-12 · unverdicted · none · ref 6 · 2 links
A dual-branch system using frequency edge cues and CLIP-based synthetic patch detection for accurate, resolution-independent image forgery localization.
TB-AVA: Text as a Semantic Bridge for Audio-Visual Parameter Efficient Finetuning cs.CV · 2026-05-12 · unverdicted · none · ref 13 · 2 links
TB-AVA uses text-mediated gated semantic modulation to enable efficient audio-visual alignment, achieving state-of-the-art results on AVE, AVS, and AVVP benchmarks.
Removing the Watermark Is Not Enough: Forensic Stealth in Generative-AI Watermark Removal cs.CR · 2026-05-09 · unverdicted · none · ref 35
Current AI image watermark removal attacks replace the watermark with a different forensic signal, allowing independent detectors to distinguish processed outputs from clean images at over 98% true-positive rate under a 1% false-positive budget.
Probability-Flow Distillation: Exact Wasserstein Gradient Flow for High-Fidelity 3D Generation cs.CV · 2026-05-09 · unverdicted · none · ref 28
Probability-Flow Distillation exactly matches the Wasserstein gradient flow of the target distribution when distilling 2D diffusion priors into 3D models, yielding higher-fidelity results than SDS or SDI.
Exposing and Mitigating Temporal Attack in Deepfake Video Detection cs.CV · 2026-05-08 · unverdicted · none · ref 16
SpInShield is a temporal spectral-invariant defense that decouples semantic motion from manipulatable spectral artifacts in deepfake detectors via a learnable adversary and shortcut suppression optimization.
EAPFusion: Intrinsic Evolving Auxiliary Prior Guidance for Infrared and Visible Image Fusion cs.CV · 2026-05-03 · unverdicted · none · ref 39
EAPFusion uses self-evolving intrinsic priors to produce dynamic, scene-adaptive convolution kernels and channel-mixing fusion for infrared-visible images, reporting state-of-the-art results and downstream gains.
Model Merging: Foundations and Algorithms cs.LG · 2026-05-02 · unverdicted · none · ref 70
New cycle-consistent optimization, task vector theory, singular vector decompositions, adaptive routing, and efficient evolutionary search provide foundations for merging neural network weights across tasks.
Neighbor2Inverse: Self-Supervised Denoising for Low-Dose Region-of-Interest Phase Contrast CT cs.CV · 2026-05-01 · unverdicted · none · ref 14
Neighbor2Inverse adapts the Neighbor2Neighbor principle to train a denoising network directly in the image domain for low-dose PBI-CT by using independently noised subsampled projections.
CSC: Turning the Adversary's Poison against Itself cs.CR · 2026-04-23 · unverdicted · none · ref 37
CSC identifies backdoored samples via early-epoch latent clustering and conceals them by relabeling to a virtual class, driving attack success rates near zero on benchmarks with little clean accuracy loss.
Where are they looking in the operating room? cs.CV · 2026-04-22 · unverdicted · none · ref 42
Gaze-following models on extended 4D-OR and Team-OR datasets reach F1 scores of 0.92 for clinical role prediction and 0.95 for surgical phase recognition while improving team communication detection by over 30%.
Explicit Dropout: Deterministic Regularization for Transformer Architectures cs.LG · 2026-04-22 · unverdicted · none · ref 20
Explicit dropout reformulates stochastic dropout as deterministic loss penalties for Transformers, matching or exceeding standard performance with independent control per component.
Random Walk on Point Clouds for Feature Detection cs.CV · 2026-04-22 · unverdicted · none · ref 34 · 2 links
RWoDSN extracts feature points from point clouds via a novel DSN descriptor and random walk graph analysis, reporting 22% higher recall than prior state-of-the-art with 0.784 precision.
