author Tang, Y

Gudovskiy, D · 2022 · arXiv 1458.2022

16 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 3 · dataset 1

citation-polarity summary

background 4

representative citing papers

Cyclic 2.5D Perceptual Loss for Cross-Modal 3D Medical Image Synthesis: T1w MRI to Tau PET

eess.IV · 2024-06-18 · unverdicted · novelty 7.0

Proposes a cyclic 2.5D perceptual loss with manufacturer SUVR standardization for T1w MRI to tau PET synthesis, reporting improved regional agreement on ADNI and SCAN cohorts across U-Net, UNETR, SwinUNETR, CycleGAN, and Pix2Pix.

Evaluating Object Hallucination in Large Vision-Language Models

cs.CV · 2023-05-17 · accept · novelty 7.0

Large vision-language models exhibit severe object hallucination that varies with training instructions, and the proposed POPE polling method evaluates it more stably and flexibly than prior approaches.

Anomaly Factory 3D: A Modular Framework for Diverse Pseudo-Anomaly Synthesis in Unsupervised 3D Anomaly Detection

cs.CV · 2026-06-28 · unverdicted · novelty 6.0

AF3AD is a modular synthesis framework using center-conditioned parametric deformations in local PCA frames to create diverse pseudo-anomalies, improving unsupervised 3D anomaly detection on AnomalyShapeNet and Real3D-AD.

Bengal-HP_RU: A Dataset of Bengal People For Head Pose Estimation

cs.CV · 2026-06-23 · unverdicted · novelty 6.0 · 2 refs

Bengal-HP_RU is the first publicly available head pose dataset for Bengali subjects, with 12,894 images collected from Wikimedia Commons and partitioned by uploader identity.

Controllable Texture Tiling with Transformed RoPE-Enhanced Diffusion Models

cs.GR · 2026-06-22 · unverdicted · novelty 6.0

A Diffusion Transformer framework applies coordinate-transformed RoPE and disjoint attention masks to achieve controllable, high-fidelity texture tiling that preserves reference structure and scene lighting.

Enhancing Multilingual Reasoning via Steerable Model Merging

cs.CL · 2026-06-17 · unverdicted · novelty 6.0

ST-Merge uses gated cross-attention to adaptively weight source models during merging, outperforming baselines on multilingual reasoning tasks across 21 languages.

MS-DKC: A Dataset Knowledge Card Framework for Designing and Adapting Medical Image Segmentation Models

cs.CV · 2026-06-04 · unverdicted · novelty 6.0

MS-DKC is a dataset knowledge card framework that maps image, morphology, supervision, context, and risk descriptors to design priors and failure modes, shown to produce dataset-specific model adaptations with improved metrics on DRIVE, ISIC2018, and ACDC.

AdaCodec: A Predictive Visual Code for Video MLLMs

cs.CV · 2026-06-01 · unverdicted · novelty 6.0

AdaCodec introduces a predictive visual code that cuts visual token use in video MLLMs by sending full frames only on high predictive cost and otherwise encoding inter-frame changes as P-tokens, yielding better benchmark scores at lower budgets.

Deep Psychovisual Image Representations

cs.CV · 2026-05-28 · unverdicted · novelty 6.0

Proposes a psychovisual-inspired deep learning method that encodes images in learned frequency sub-bands for interpretable semantic structures and reduced depth dependence.

Utilizing Inpainting for Keypoint Detection for Vision-Based Control of Robotic Manipulators

cs.RO · 2026-04-14 · unverdicted · novelty 6.0

A framework trains keypoint detectors on inpainted markerless robot images and uses runtime inpainting plus UKF for robust vision-based control without models or calibration.

Rotary Masked Autoencoders are Versatile Learners

cs.LG · 2025-05-26 · unverdicted · novelty 6.0

RoMAE applies rotary positional embeddings to masked autoencoders to enable representation learning and interpolation on continuous positional data across irregular time-series, images, and audio without modality-specific modifications.

$\mu$Flow: Leveraging Average Images for Improving Generalisation of Deepfake Faces Detectors

cs.CV · 2026-06-29 · unverdicted · novelty 5.0

μFlow trains a normalizing flow on averaged real-image features to detect deepfakes via likelihood in a fully out-of-distribution setting.

Out-of-Distribution Generalization in Time Series: A Survey

cs.LG · 2025-03-18 · unverdicted · novelty 5.0

This is the first comprehensive survey of OOD generalization methodologies for time series, organized across data distribution, representation learning, and OOD evaluation.

Autonomous Unmanned Aircraft Systems for Enhanced Search and Rescue of Drowning Swimmers: Image-Based Localization and Mission Simulation

cs.CV · 2026-04-20 · unverdicted · novelty 4.0

A UAS with YOLO-based swimmer detection and DES simulations reduces drowning rescue response time by a factor of five versus standard operations in tested lake areas.

PaliGemma: A versatile 3B VLM for transfer

cs.CV · 2024-07-10 · unverdicted · novelty 4.0

PaliGemma is an open 3B VLM based on SigLIP and Gemma that achieves strong performance on nearly 40 diverse open-world tasks including benchmarks, remote-sensing, and segmentation.

SoK: A Comprehensive Analysis of the Current Status of Neural Tangent Generalization Attacks with Research Directions

cs.LG · 2026-05-12 · accept · novelty 3.0

NTGA is the first clean-label generalization attack under black-box settings but is vulnerable to adversarial training and image transformations, with newer attacks outperforming it.

citing papers explorer

Showing 1 of 1 citing paper after filters.

Evaluating Object Hallucination in Large Vision-Language Models

cs.CV · 2023-05-17 · accept · none · ref 7

Large vision-language models exhibit severe object hallucination that varies with training instructions, and the proposed POPE polling method evaluates it more stably and flexibly than prior approaches.
