In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp

Rameen Abdal, Yipeng Qin, Peter Wonka · 2020 · arXiv 2600.2020

Canonical reference. 190 Pith papers cite this work, and 71% of classified citations treat it as background.

citation-role summary

background 31 · dataset 9 · method 5 · baseline 3

citation-polarity summary

background 34 · use dataset 6 · use method 5 · baseline 3
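
The 71% background share reported above appears to match the citation-polarity counts rather than the citation-role counts. The sketch below checks that arithmetic. It assumes the percentage is taken over the 48 classified citations in the polarity summary; the helper name `polarity_shares` is hypothetical and is not part of any hub tool.

```python
# Counts copied from the citation-polarity summary.
polarity_counts = {
    "background": 34,
    "use dataset": 6,
    "use method": 5,
    "baseline": 3,
}

# Counts copied from the citation-role summary, for comparison.
role_counts = {
    "background": 31,
    "dataset": 9,
    "method": 5,
    "baseline": 3,
}


def polarity_shares(counts):
    """Return each label's share of all classified citations as a whole-number percentage."""
    total = sum(counts.values())
    return {label: round(100 * n / total) for label, n in counts.items()}


shares = polarity_shares(polarity_counts)
print(shares["background"])  # 34 of 48 classified citations -> 71

role_shares = polarity_shares(role_counts)
print(role_shares["background"])  # 31 of 48 classified citations -> 65
```

Under this assumption, only the polarity counts reproduce the 71% figure; the role counts would give about 65% for background.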

authors

Rameen Abdal, Yipeng Qin, Peter Wonka

representative citing papers

WildBox: A Dataset and Benchmark for Aerial Monocular 3D Detection of African Savanna Wildlife

cs.CV · 2026-06-19 · unverdicted · novelty 8.0

WildBox provides over 237k 3D wildlife annotations from drone video. Its benchmarks show zero-shot 3D detection at 0 AP and fine-tuned performance of 8.68 AP-BEV and 13.17 AP3D, with depth estimation causing most errors.

Mind2Web: Towards a Generalist Agent for the Web

cs.CL · 2023-06-09 · accept · novelty 8.0

Mind2Web is the first large-scale dataset of real-world web tasks for developing generalist language-guided agents that complete complex actions on diverse websites.

Learning Spectral and Polarimetric Clues for One-to-Multimodal Novel View Synthesis

cs.CV · 2026-07-02 · unverdicted · novelty 7.0

SPoILeR uses multimodal pre-training to enable accurate novel view synthesis of infrared, polarimetric, and multispectral data from RGB-supervised fine-tuning on new scenes.

MoHallBench: A Benchmark for Motion Hallucination in Video Large Language Models

cs.CV · 2026-07-01 · unverdicted · novelty 7.0

MoHallBench is a new benchmark evaluating motion hallucination in VideoLLMs from co-occurrence priors, sequential inference, and similarity confusion, revealing decoupling from action recognition performance.

PRISM-VO: Scale-Aware Visual Odometry Using Photometric Plenoptic Bundle Adjustment

cs.CV · 2026-06-30 · unverdicted · novelty 7.0

PRISM-VO introduces photometric plenoptic bundle adjustment for drift-resilient, metric-scale visual odometry from a single focused plenoptic camera.

Learning to Deny: Action Denial in Multimodal Large Language Models

cs.CV · 2026-06-30 · unverdicted · novelty 7.0 · 3 refs

MLLMs drop from over 85% accuracy on action presence to under 50% on matched action-denial videos, exposing a causal verification gap that causal graph prompts partially close.

Diffusion-Based Material Regularization for Physics-Based Inverse Rendering

cs.CV · 2026-06-30 · unverdicted · novelty 7.0

A regularization technique that treats diffusion model outputs as a similarity kernel during material optimization in inverse rendering, enabling joint reconstruction of geometry, materials, and illumination that satisfies the rendering equation and generalizes to new lighting.

HASTE: A Framework for Training-Free, Dynamic, and Steerable Compression of Pre-Trained Convolutional Neural Networks

cs.CV · 2026-06-29 · unverdicted · novelty 7.0 · 2 refs

HASTE enables training-free dynamic compression of pre-trained CNNs by patch-wise LSH-based merging of redundant channels, reporting 46.2% FLOPs reduction on ResNet34 CIFAR-10 with 1.25% accuracy drop.

AirGroundBench: Probing Spatial Intelligence in Multimodal Large Models under Heterogeneous Multi-View Embodied Collaboration

cs.CV · 2026-06-26 · unverdicted · novelty 7.0

AirGroundBench is a new diagnostic benchmark exposing that MLLMs handle basic spatial perception but struggle with cross-view alignment, transformation reasoning, and embodied navigation under heterogeneous air-ground views.

ScaLe-INR: Scale and Learn Implicit Neural Representations

cs.CV · 2026-06-26 · unverdicted · novelty 7.0

ScaLe-INR is a multi-branch INR architecture that applies directional scaling per the Fourier inverse theorem and a directional edge guidance loss to disentangle scales and improve reconstruction fidelity.

Semantic Browsing: Controllable Diversity for Image Generation

cs.CV · 2026-06-22 · unverdicted · novelty 7.0

A technique for controllable diversity in text-to-image generation by inducing structured semantic variations at the prompt level via VLM and agentic workflow.

4DVLT: Dynamic Scene Understanding with Worldline-Centered Vision-Language Tracking

cs.CV · 2026-06-21 · conditional · novelty 7.0 · 2 refs

The paper defines the 4DVLT task for worldline-centered 4D scene understanding, releases Instruct-4D with 129.4K QA pairs, and presents 4DTrack achieving 62.68 TGA_Top1, outperforming adapted baselines by 19.62 points.

Deep Unrolled Networks in Representation Space Applied to MRI Reconstruction

eess.IV · 2026-06-19 · unverdicted · novelty 7.0

DUNE enables exact data-consistency gradients via VJP when deep unrolled networks operate in representation space, yielding better MRI reconstructions than prior heuristic-DC variants.

Does Text Actually Help? Uncovering and Resolving Text Collapse in Multimodal Time Series Forecasting

cs.LG · 2026-06-17 · unverdicted · novelty 7.0

REST-TS resolves text collapse in multimodal time series forecasting by exclusively supervising the text branch on numerical residuals to compel genuine content extraction from text descriptions.

Human Universal Grasping

cs.RO · 2026-06-15 · unverdicted · novelty 7.0

HUG trains a flow-matching model on a new 1M-frame egocentric human grasp dataset to generate retargetable grasps from single RGB-D images, beating baselines by 23-34% on a new 90-object benchmark.

iSAGE: A Human-in-the-Loop Framework for Remote Sensing Semantic Segmentation via Sparse Point Supervision

cs.CV · 2026-06-08 · unverdicted · novelty 7.0

iSAGE achieves near-dense mIoU performance in remote sensing semantic segmentation using iterative expert clicks on confident model errors with an error-weighted loss, using only 0.011-0.04% of pixels.

Targeting World Models to Compromise Robot Learning Pipelines

cs.RO · 2026-06-08 · unverdicted · novelty 7.0

World models introduce a stealthy poisoning vector into robot learning pipelines where malicious prompts or dynamics in teleoperated data activate only during synthetic trajectory generation, enabling backdoors in downstream policies.

Bridging CAD and Data-Driven Design: Attributed Feature Graphs for Engineering Design

cs.CE · 2026-06-04 · unverdicted · novelty 7.0

Attributed Feature Graphs (AFGs) represent CAD features as attributed nodes and relations as directed edges to enable GNN surrogate models that predict design performance with feature-level interpretability on the CarHoods10K dataset.

LL-Bench: Rethinking Low-Level Vision Evaluation in the Era of Large-Scale Generative Models

cs.CV · 2026-06-01 · unverdicted · novelty 7.0

LL-Bench supplies a human-annotated dataset exposing generative model weaknesses in low-level restoration and introduces LL-Score as an MLLM evaluator that outperforms existing quality metrics and can serve as a training reward.

DPA4: Pushing the Accuracy-Cost Frontier of Interatomic Potentials with EMFA SO(2) Convolution

physics.chem-ph · 2026-06-01 · unverdicted · novelty 7.0

DPA4 is a new SE(3)-equivariant interatomic potential with EMFA SO(2) convolution that sets new accuracy-cost records on Matbench Discovery and SPICE benchmarks using fewer parameters than prior models.

Quality-Guided Semi-Supervised Learning for Medical Image Segmentation

cs.CV · 2026-06-01 · unverdicted · novelty 7.0

A new quality-guided approach for semi-supervised medical image segmentation that trains a predictor on synthetic errors to enhance pseudolabel handling.

DELOS: Detecting Shallow Transits in Kepler Photometry Using a Contrastive-Learning Framework

astro-ph.EP · 2026-05-28 · conditional · novelty 7.0 · 2 refs

DELOS applies contrastive learning to phase-folded light curves to detect shallow intermediate-to-long period transits, reporting 15.5% and 11.25% gains in combined precision-recall over BLS and TLS in low-SNR tests plus 3-80x speedups.

The Abstraction Gap in Vision-Language Causal Reasoning

cs.CL · 2026-05-27 · unverdicted · novelty 7.0

Introduces Abstraction Gap metric and CAGE benchmark showing seven of eight VLMs have large gaps between text plausibility and chain-based causal reasoning, with one model succeeding.

Category-Level 3D Correspondence in Camera Space via Morphable Object Priors

cs.CV · 2026-05-27 · unverdicted · novelty 7.0 · 2 refs

Morpheus learns morphable category-level shape priors to produce implicit 3D correspondences in camera space without explicit supervision and releases the HouseCorr3D benchmark with amodal and symmetry annotations.

citing papers explorer

Showing 23 of 23 citing papers after filters.

Hyperbolic Concept Bottleneck Models

cs.LG · 2026-05-07 · unverdicted · none · ref 27

HypCBM reformulates concept activations as geometric containment in hyperbolic space to produce sparse, hierarchy-aware signals that match Euclidean models trained on 20 times more data.

Continuous Expert Assembly: Instance-Conditioned Low-Rank Residuals for All-in-One Image Restoration

cs.CV · 2026-05-07 · unverdicted · none · ref 6

CEA assembles per-token low-rank residual updates via dense affinities over hyper-adapter-generated components to improve all-in-one image restoration on spatially non-uniform degradations.

Evaluating LLMs on Large-Scale Graph Property Estimation via Random Walks

cs.LG · 2026-05-02 · unverdicted · none · ref 17 · 2 links

EstGraph benchmark evaluates LLMs on estimating properties of very large graphs from random-walk samples that fit in context limits.

Divide-and-Conquer Approach to Holistic Cognition in High-Similarity Contexts with Limited Data

cs.CV · 2026-04-21 · unverdicted · none · ref 30 · 2 links

DHCNet improves ultra-fine-grained visual categorization by progressively building holistic cognition from local discrepancies using self-shuffling and refinement on limited data.

A global dataset of continuous urban dashcam driving

cs.CV · 2026-04-01 · accept · none · ref 12

CROWD is a new global dataset of 51,753 continuous urban dashcam segments spanning over 20,000 hours from 238 countries, with manual labels and automated object detections for routine driving analysis.

Weather-Robust Cross-View Geo-Localization via Prototype-Based Semantic Part Discovery

cs.CV · 2026-05-12 · unverdicted · none · ref 50 · 2 links

SkyPart achieves state-of-the-art single-pass cross-view geo-localization on SUES-200, University-1652, and DenseUAV by using prototype-based part discovery, altitude-conditioned modulation, and Kendall-weighted loss, with widening gains under weather corruptions.

MAG-VLAQ: Multi-modal Aerial-Ground Query Aggregation for Cross-View Place Recognition

cs.CV · 2026-05-10 · unverdicted · none · ref 5 · 2 links

MAG-VLAQ fuses multi-modal ground and aerial data via ODE-conditioned vector-of-locally-aggregated-queries to nearly double recall@1 on aerial-ground place recognition benchmarks.

Removing the Watermark Is Not Enough: Forensic Stealth in Generative-AI Watermark Removal

cs.CR · 2026-05-09 · unverdicted · none · ref 9 · 2 links

Current AI image watermark removal attacks replace the watermark with a different forensic signal, allowing independent detectors to distinguish processed outputs from clean images at over 98% true-positive rate under a 1% false-positive budget.

Model Merging: Foundations and Algorithms

cs.LG · 2026-05-02 · unverdicted · none · ref 81

New cycle-consistent optimization, task vector theory, singular vector decompositions, adaptive routing, and efficient evolutionary search provide foundations for merging neural network weights across tasks.

Where are they looking in the operating room?

cs.CV · 2026-04-22 · unverdicted · none · ref 8

Gaze-following models on extended 4D-OR and Team-OR datasets reach F1 scores of 0.92 for clinical role prediction and 0.95 for surgical phase recognition while improving team communication detection by over 30%.

Harnessing Weak Pair Uncertainty for Text-based Person Search

cs.CV · 2026-04-10 · conditional · none · ref 20

Uncertainty estimation and regularization on weak positive pairs improves mAP by 3.06%, 3.55%, and 6.94% on CUHK-PEDES, RSTPReid, and ICFG-PEDES respectively.

Toward Unified Fine-Grained Vehicle Classification and Automatic License Plate Recognition

cs.CV · 2026-04-07 · accept · none · ref 10

UFPR-VeSV is a new real-world dataset for fine-grained vehicle classification and automatic license plate recognition collected from Brazilian police cameras, with benchmarks demonstrating its difficulty and the value of joint task use.

Multi-Narrow Transformation as a Single-Model Ensemble: Boundary Conditions, Mechanisms, and Failure Modes

cs.LG · 2026-05-12 · unverdicted · none · ref 23

Multi-narrow single-model ensembles outperform wide baselines in low-data image classification by learning diverse features but underperform in data-rich settings where training favors few paths.

Dual-stream Spatio-Temporal GCN-Transformer Network for 3D Human Pose Estimation

cs.CV · 2026-04-20 · unverdicted · none · ref 15

MixTGFormer reports state-of-the-art 3D pose estimation errors of 37.6 mm on Human3.6M and 15.7 mm on MPI-INF-3DHP by using parallel GCN-Transformer streams with SE layers for local-global feature fusion.

Weak-to-Strong Knowledge Distillation Accelerates Visual Learning

cs.CV · 2026-04-16 · unverdicted · none · ref 53

Weak-to-strong knowledge distillation applied early and then turned off accelerates convergence to target performance in visual learning tasks by factors of 1.7-4.8x.

Protecting and Preserving Protest Dynamics for Responsible Analysis

cs.CV · 2026-04-06 · unverdicted · none · ref 11 · 3 links

A responsible computing framework substitutes real protest imagery with labeled synthetic reproductions from conditional image synthesis to enable privacy-aware analysis of collective action patterns.

XiYOLO: Energy-Aware Object Detection via Iterative Architecture Search and Scaling

cs.CV · 2026-05-07 · unverdicted · none · ref 33 · 2 links

XiYOLO uses iterative energy-aware neural architecture search and scaling to produce object detectors with stronger accuracy-energy tradeoffs than YOLO baselines on GPUs and NPUs.

RoomRecon: High-Quality Textured Room Layout Reconstruction on Mobile Devices

cs.RO · 2026-04-21 · unverdicted · none · ref 24

RoomRecon delivers a real-time mobile system for high-quality textured 3D room reconstructions that combines AR-guided imaging with generative AI texturing focused on permanent structures and claims to outperform prior methods in quality and speed.

Trajectory Prediction for Autonomous Driving: Progress, Limitations, and Future Directions

cs.RO · 2025-03-05 · unverdicted · none · ref 17 · 3 links

A survey of trajectory prediction techniques for autonomous vehicles that proposes a taxonomy, overviews the prediction pipeline, and highlights remaining research gaps.

Generalization Under Scrutiny: Cross-Domain Detection Progresses, Pitfalls, and Persistent Challenges

cs.CV · 2026-04-09 · unverdicted · none · ref 19 · 5 links

A survey that organizes methods for cross-domain object detection into a taxonomy, analyzes domain shift across detection stages, and outlines persistent challenges.

Software Engineering for Self-Adaptive Robotics: A Research Agenda

cs.SE · 2025-05-26 · unverdicted · none · ref 132

This paper proposes a research agenda for software engineering of self-adaptive robotic systems along lifecycle stages and enabling technologies, identifying challenges and a roadmap to 2030.

Explainable Artificial Intelligence Techniques for Interpretation of Food Models: a Review

cs.AI · 2025-04-12 · unverdicted · none · ref 49

A survey proposing a taxonomy of XAI techniques for food quality research organized by data types and explanation methods.

A Survey on Deep Learning Architectures for Point Cloud Classification and Segmentation

cs.CV · 2026-05-16 · unverdicted · none · ref 96 · 4 links

A survey that categorizes deep learning models for point cloud tasks by backbone architecture, evaluates benchmark performance, and outlines challenges and future research directions.
