Deep residual learning for image recognition

· 2016

Tool reference. 80% of classified Pith citations (8 of 10) use this work as a method, library, or software dependency rather than as a substantive claim.

69 Pith papers citing it

citation-role summary

method: 8 · background: 1 · baseline: 1

citation-polarity summary

use method: 8 · background: 1 · baseline: 1

representative citing papers

iMiGUE-3K: A Large-Scale Benchmark for Micro-Gesture Analysis with Self-Supervised Learning

cs.CV · 2026-05-16 · unverdicted · novelty 8.0

iMiGUE-3K is the largest in-the-wild micro-gesture video dataset with 3.4K clips and 37M frames from real interviews, supporting self-supervised foundation models and benchmarks that show micro-gestures improve emotion understanding.

MetaEarth-MM: Unified Multimodal Remote Sensing Image Generation with Scene-centered Joint Modeling

cs.CV · 2026-05-19 · conditional · novelty 7.0

MetaEarth-MM unifies multi-modal remote sensing image generation and any-to-any translation across five modalities via scene-centered joint modeling on the new EarthMM dataset.

Interactive State Space Model with Cross-Modal Local Scanning for Depth Super-Resolution

cs.CV · 2026-05-12 · unverdicted · novelty 7.0

A Mamba-based interactive state space model with cross-modal local scanning achieves competitive guided depth super-resolution performance at linear computational cost.

Classification-Head Bias in Class-Level Machine Unlearning: Diagnosis, Mitigation, and Evaluation

cs.LG · 2026-05-09 · conditional · novelty 7.0

Class-level unlearning can take a shortcut via bias suppression in the classification head; new bias-aware training mechanisms and bias-specific metrics are introduced to diagnose and reduce this dependence.

InterMesh: Explicit Interaction-Aware End-to-End Multi-Person Human Mesh Recovery

cs.CV · 2026-05-06 · conditional · novelty 7.0

InterMesh explicitly incorporates human-object interaction semantics into multi-person mesh recovery via a detector and two lightweight modules, delivering up to 9.9% MPJPE reduction on interaction-heavy datasets.

ShapeGrasp: Simultaneous Visuo-Haptic Shape Completion and Grasping for Improved Robot Manipulation

cs.RO · 2026-05-04 · conditional · novelty 7.0

ShapeGrasp improves grasp success on unknown objects to 84-91% by iteratively updating a 3D shape model with visuo-haptic feedback during real-world grasp attempts.

Learning from Compressed CT: Feature Attention Style Transfer and Structured Factorized Projections for Resource-Efficient Medical Image Analysis

cs.CV · 2026-05-01 · unverdicted · novelty 7.0

CT-Lite combines Feature Attention Style Transfer (FAST) and Structured Factorized Projections (SFP) with contrastive learning to reach AUROC within 5-7% of uncompressed baselines on compressed CT volumes across three datasets while using far fewer parameters.

Hierarchical Spatio-Channel Clustering for Efficient Model Compression in Medical Image Analysis

cs.CV · 2026-04-25 · unverdicted · novelty 7.0

A spatio-channel clustering framework for CNN compression reduces FLOPs by 81% and raises brain tumor MRI classification accuracy from 87.76% to 89.80% compared with global SVD and Tucker baselines.

Latent Space Probing for Adult Content Detection in Video Generative Models

cs.CV · 2026-04-25 · unverdicted · novelty 7.0

Latent space probing on CogVideoX achieves 97.29% F1 for adult content detection on a new 11k-clip dataset with 4-6ms overhead.

Channel-Level Semantic Perturbations: Unlearnable Examples for Diverse Training Paradigms

cs.LG · 2026-04-18 · unverdicted · novelty 7.0

Unlearnable examples fail under pretraining-finetuning due to semantic filtering by frozen layers, but Shallow Semantic Camouflage restores effectiveness by confining perturbations to semantically valid subspaces.

Physically-Induced Atmospheric Adversarial Perturbations: Enhancing Transferability and Robustness in Remote Sensing Image Classification

cs.CV · 2026-04-16 · unverdicted · novelty 7.0

FogFool creates fog-based adversarial perturbations using Perlin noise optimization to achieve high black-box transferability (83.74% TASR) and robustness to defenses in remote sensing classification.

CDPR: Cross-modal Diffusion with Polarization for Reliable Monocular Depth Estimation

cs.CV · 2026-04-13 · unverdicted · novelty 7.0

CDPR integrates polarization priors into a diffusion-based monocular depth estimator via shared latent space and adaptive gating, outperforming RGB-only methods in challenging scenes.

Generalized Small Object Detection: A Point-Prompted Paradigm and Benchmark

cs.CV · 2026-04-03 · unverdicted · novelty 7.0

TinySet-9M dataset and DEAL point-prompted framework deliver 31.4% relative AP75 gain over supervised baselines for small object detection with one click at inference and generalization to unseen categories.

Sparse Bayesian Learning Algorithms Revisited: From Learning Majorizers to Structured Algorithmic Learning using Neural Networks

eess.SP · 2026-04-02 · conditional · novelty 7.0

SBL algorithms are unified under majorization-minimization with new convergence results, and a dimension-invariant neural network learns superior data-driven update rules that generalize across matrices and parameters.

Beyond Corner Patches: Semantics-Aware Backdoor Attack in Federated Learning

cs.CR · 2026-03-31 · unverdicted · novelty 7.0

SABLE shows that semantics-aware natural triggers enable effective backdoor attacks in federated learning against multiple aggregation rules while preserving benign accuracy.

Membership Inference for Contrastive Pre-training Models with Text-only PII Queries

cs.CR · 2026-03-15 · unverdicted · novelty 7.0

UMID infers membership in contrastive pre-training data using only text queries by performing latent inversion and comparing similarity and variability signals to synthetic gibberish references via unsupervised anomaly detection.

CBEN -- A Multimodal Machine Learning Dataset for Cloud Robust Remote Sensing Image Understanding

cs.CV · 2026-02-13 · accept · novelty 7.0

CBEN provides paired optical-radar images with cloud occlusion, revealing 23-33 point AP drops in clear-sky trained models and 17-29 point relative gains when models are trained on cloudy data.

CoLA-Flow Policy: Temporally Coherent Imitation Learning via Continuous Latent Action Flow Matching for Robotic Manipulation

cs.RO · 2026-01-30 · unverdicted · novelty 7.0 · 2 refs

CoLA-Flow Policy encodes action sequences into a continuous latent space and learns an explicit flow there, yielding near-single-step inference with up to 93.7% smoother trajectories and 25-point higher task success than raw-action flow baselines.

Building Deep Graph Predictors with Graph Imitation Learning

cs.CV · 2026-01-21 · unverdicted · novelty 7.0

GRAIL trains graph predictors via imitation learning by modeling generation as sequential decisions on partial graph embeddings, matching or exceeding prior methods on 18 benchmarks.

Re-Key-Free, Risky-Free: Adaptable Model Usage Control

cs.CR · 2025-11-24 · unverdicted · novelty 7.0

AdaLoc keeps a model locked to authorized users by confining all post-deployment updates to a chosen subset of weights, preserving both task performance for authorized use and near-random accuracy for unauthorized use across vision and language models.

EmbodiTTA: Resource-Efficient Test-Time Adaptation for Embodied Visual Systems

cs.LG · 2025-05-02 · unverdicted · novelty 7.0

OD-TTA enables resource-efficient test-time adaptation on edge devices by triggering updates only on detected domain shifts, achieving comparable accuracy with lower energy and computation costs for embodied visual systems.

Lowering the Barrier to IREX Participation: Open-Source Algorithms, Toolkit, and Benchmarking for Iris Recognition

cs.CV · 2026-05-20 · accept · novelty 6.0

Open-source neural network iris matchers (TripletIris using batch-hard triplet loss and ArcIris using ArcFace loss) plus compliant C++ implementations of HDBIF and CRYPTS are released, evaluated on IREX X and eight academic datasets, and accompanied by segmentation tools to lower entry barriers for

TAR: Text Semantic Assisted Cross-modal Image Registration Framework for Optical and SAR Images

cs.CV · 2026-05-12 · unverdicted · novelty 6.0

TAR uses frozen text encoders on remote sensing scene descriptions to boost high-level features for coarse-to-fine optical-SAR image registration under large deformations.

Reward-Guided Semantic Evolution for Test-time Adaptive Object Detection

cs.CV · 2026-05-06 · unverdicted · novelty 6.0

RGSE adapts text embeddings at test time via evolutionary search, using cosine similarity rewards from high-confidence visual proposals to improve open-vocabulary object detection under distribution shifts.

citing papers explorer

Showing 50 of 69 citing papers.

iMiGUE-3K: A Large-Scale Benchmark for Micro-Gesture Analysis with Self-Supervised Learning cs.CV · 2026-05-16 · unverdicted · none · ref 49
iMiGUE-3K is the largest in-the-wild micro-gesture video dataset with 3.4K clips and 37M frames from real interviews, supporting self-supervised foundation models and benchmarks that show micro-gestures improve emotion understanding.
MetaEarth-MM: Unified Multimodal Remote Sensing Image Generation with Scene-centered Joint Modeling cs.CV · 2026-05-19 · conditional · none · ref 97
MetaEarth-MM unifies multi-modal remote sensing image generation and any-to-any translation across five modalities via scene-centered joint modeling on the new EarthMM dataset.
Interactive State Space Model with Cross-Modal Local Scanning for Depth Super-Resolution cs.CV · 2026-05-12 · unverdicted · none · ref 23
A Mamba-based interactive state space model with cross-modal local scanning achieves competitive guided depth super-resolution performance at linear computational cost.
Classification-Head Bias in Class-Level Machine Unlearning: Diagnosis, Mitigation, and Evaluation cs.LG · 2026-05-09 · conditional · none · ref 35
Class-level unlearning can take a shortcut via bias suppression in the classification head; new bias-aware training mechanisms and bias-specific metrics are introduced to diagnose and reduce this dependence.
InterMesh: Explicit Interaction-Aware End-to-End Multi-Person Human Mesh Recovery cs.CV · 2026-05-06 · conditional · none · ref 33
InterMesh explicitly incorporates human-object interaction semantics into multi-person mesh recovery via a detector and two lightweight modules, delivering up to 9.9% MPJPE reduction on interaction-heavy datasets.
ShapeGrasp: Simultaneous Visuo-Haptic Shape Completion and Grasping for Improved Robot Manipulation cs.RO · 2026-05-04 · conditional · none · ref 50
ShapeGrasp improves grasp success on unknown objects to 84-91% by iteratively updating a 3D shape model with visuo-haptic feedback during real-world grasp attempts.
Learning from Compressed CT: Feature Attention Style Transfer and Structured Factorized Projections for Resource-Efficient Medical Image Analysis cs.CV · 2026-05-01 · unverdicted · none · ref 24
CT-Lite combines Feature Attention Style Transfer (FAST) and Structured Factorized Projections (SFP) with contrastive learning to reach AUROC within 5-7% of uncompressed baselines on compressed CT volumes across three datasets while using far fewer parameters.
Hierarchical Spatio-Channel Clustering for Efficient Model Compression in Medical Image Analysis cs.CV · 2026-04-25 · unverdicted · none · ref 37
A spatio-channel clustering framework for CNN compression reduces FLOPs by 81% and raises brain tumor MRI classification accuracy from 87.76% to 89.80% compared with global SVD and Tucker baselines.
Latent Space Probing for Adult Content Detection in Video Generative Models cs.CV · 2026-04-25 · unverdicted · none · ref 46
Latent space probing on CogVideoX achieves 97.29% F1 for adult content detection on a new 11k-clip dataset with 4-6ms overhead.
Channel-Level Semantic Perturbations: Unlearnable Examples for Diverse Training Paradigms cs.LG · 2026-04-18 · unverdicted · none · ref 51
Unlearnable examples fail under pretraining-finetuning due to semantic filtering by frozen layers, but Shallow Semantic Camouflage restores effectiveness by confining perturbations to semantically valid subspaces.
Physically-Induced Atmospheric Adversarial Perturbations: Enhancing Transferability and Robustness in Remote Sensing Image Classification cs.CV · 2026-04-16 · unverdicted · none · ref 49
FogFool creates fog-based adversarial perturbations using Perlin noise optimization to achieve high black-box transferability (83.74% TASR) and robustness to defenses in remote sensing classification.
CDPR: Cross-modal Diffusion with Polarization for Reliable Monocular Depth Estimation cs.CV · 2026-04-13 · unverdicted · none · ref 56
CDPR integrates polarization priors into a diffusion-based monocular depth estimator via shared latent space and adaptive gating, outperforming RGB-only methods in challenging scenes.
Generalized Small Object Detection: A Point-Prompted Paradigm and Benchmark cs.CV · 2026-04-03 · unverdicted · none · ref 74
TinySet-9M dataset and DEAL point-prompted framework deliver 31.4% relative AP75 gain over supervised baselines for small object detection with one click at inference and generalization to unseen categories.
Sparse Bayesian Learning Algorithms Revisited: From Learning Majorizers to Structured Algorithmic Learning using Neural Networks eess.SP · 2026-04-02 · conditional · none · ref 43
SBL algorithms are unified under majorization-minimization with new convergence results, and a dimension-invariant neural network learns superior data-driven update rules that generalize across matrices and parameters.
Beyond Corner Patches: Semantics-Aware Backdoor Attack in Federated Learning cs.CR · 2026-03-31 · unverdicted · none · ref 43
SABLE shows that semantics-aware natural triggers enable effective backdoor attacks in federated learning against multiple aggregation rules while preserving benign accuracy.
Membership Inference for Contrastive Pre-training Models with Text-only PII Queries cs.CR · 2026-03-15 · unverdicted · none · ref 52
UMID infers membership in contrastive pre-training data using only text queries by performing latent inversion and comparing similarity and variability signals to synthetic gibberish references via unsupervised anomaly detection.
CBEN -- A Multimodal Machine Learning Dataset for Cloud Robust Remote Sensing Image Understanding cs.CV · 2026-02-13 · accept · none · ref 40
CBEN provides paired optical-radar images with cloud occlusion, revealing 23-33 point AP drops in clear-sky trained models and 17-29 point relative gains when models are trained on cloudy data.
CoLA-Flow Policy: Temporally Coherent Imitation Learning via Continuous Latent Action Flow Matching for Robotic Manipulation cs.RO · 2026-01-30 · unverdicted · none · ref 17 · 2 links
CoLA-Flow Policy encodes action sequences into a continuous latent space and learns an explicit flow there, yielding near-single-step inference with up to 93.7% smoother trajectories and 25-point higher task success than raw-action flow baselines.
Building Deep Graph Predictors with Graph Imitation Learning cs.CV · 2026-01-21 · unverdicted · none · ref 2
GRAIL trains graph predictors via imitation learning by modeling generation as sequential decisions on partial graph embeddings, matching or exceeding prior methods on 18 benchmarks.
Re-Key-Free, Risky-Free: Adaptable Model Usage Control cs.CR · 2025-11-24 · unverdicted · none · ref 43
AdaLoc keeps a model locked to authorized users by confining all post-deployment updates to a chosen subset of weights, preserving both task performance for authorized use and near-random accuracy for unauthorized use across vision and language models.
EmbodiTTA: Resource-Efficient Test-Time Adaptation for Embodied Visual Systems cs.LG · 2025-05-02 · unverdicted · none · ref 28
OD-TTA enables resource-efficient test-time adaptation on edge devices by triggering updates only on detected domain shifts, achieving comparable accuracy with lower energy and computation costs for embodied visual systems.
Lowering the Barrier to IREX Participation: Open-Source Algorithms, Toolkit, and Benchmarking for Iris Recognition cs.CV · 2026-05-20 · accept · none · ref 43
Open-source neural network iris matchers (TripletIris using batch-hard triplet loss and ArcIris using ArcFace loss) plus compliant C++ implementations of HDBIF and CRYPTS are released, evaluated on IREX X and eight academic datasets, and accompanied by segmentation tools to lower entry barriers for
TAR: Text Semantic Assisted Cross-modal Image Registration Framework for Optical and SAR Images cs.CV · 2026-05-12 · unverdicted · none · ref 44
TAR uses frozen text encoders on remote sensing scene descriptions to boost high-level features for coarse-to-fine optical-SAR image registration under large deformations.
Reward-Guided Semantic Evolution for Test-time Adaptive Object Detection cs.CV · 2026-05-06 · unverdicted · none · ref 26
RGSE adapts text embeddings at test time via evolutionary search, using cosine similarity rewards from high-confidence visual proposals to improve open-vocabulary object detection under distribution shifts.
RFPrompt: Prompt-Based Expert Adaptation of the Large Wireless Model for Modulation Classification cs.LG · 2026-05-05 · unverdicted · none · ref 21
RFPrompt adapts the Large Wireless Model via deep prompt tokens to improve out-of-distribution robustness in modulation classification while training only a small number of parameters.
MSACT: Multistage Spatial Alignment for Stable Low-Latency Fine Manipulation cs.RO · 2026-05-01 · unverdicted · none · ref 7
MSACT improves localization stability and task success rates in limited-data bimanual manipulation by extracting stable 2D attention points and aligning predicted attention sequences across frames without keypoint labels.
Stereo Multistage Spatial Attention for Real-Time Mobile Manipulation Under Visual Scale Variation and Disturbances cs.RO · 2026-05-01 · unverdicted · none · ref 32
A stereo multistage spatial attention deep predictive learning system improves robustness and success rates for real-time mobile manipulation under visual scale variation and disturbances.
HFS-TriNet: A Three-Branch Collaborative Feature Learning Network for Prostate Cancer Classification from TRUS Videos cs.CV · 2026-04-24 · unverdicted · none · ref 32
HFS-TriNet applies heuristic frame selection and a three-branch network (ResNet50, SAM-based with temporal attention, WTCR) to classify prostate cancer from TRUS videos.
CODO: An Automated Compiler for Comprehensive Dataflow Optimization cs.AR · 2026-04-14 · unverdicted · none · ref 16
CODO automates comprehensive dataflow optimization on FPGAs, achieving 1.45x-4.52x speedups on kernels and up to 33.8x on DNN models over state-of-the-art frameworks.
BiasIG: Benchmarking Multi-dimensional Social Biases in Text-to-Image Models cs.CY · 2026-04-13 · conditional · none · ref 39
BiasIG is a multi-dimensional benchmark for social biases in T2I models that shows debiasing interventions frequently cause confounding discrimination effects.
Hierarchical, Interpretable, Label-Free Concept Bottleneck Model cs.CV · 2026-04-02 · unverdicted · none · ref 42
HIL-CBM is a hierarchical label-free concept bottleneck model that improves classification accuracy and explanation quality over prior single-level CBMs using a visual consistency loss and dual heads.
Learnable Quantum Efficiency Filters for Urban Hyperspectral Segmentation cs.CV · 2026-03-27 · conditional · none · ref 23
LQE is a physics-constrained learnable dimensionality reduction technique that improves average mIoU in hyperspectral urban segmentation on three datasets while using only 12-36 parameters.
FedACT: Concurrent Federated Intelligence across Heterogeneous Data Sources cs.LG · 2026-03-11 · unverdicted · none · ref 40
FedACT schedules devices across concurrent FL jobs via alignment scoring and fairness to reduce average job completion time by up to 8.3x and raise accuracy by up to 44.5% versus baselines.
Practical Quantum Federated Learning for Privacy-Sensitive Healthcare: Communication Efficiency and Noise Resilience quant-ph · 2026-03-04 · unverdicted · none · ref 16
Hybrid QFL cuts quantum transmissions from 3TNMP to {3t + 2(T-t)}NMP over T rounds while preserving near-centralized convergence and improving depolarizing-noise resilience via decentralized aggregation and Steane-code QEC.
TFusionOcc: T-Primitive Based Object-Centric Multi-Sensor Fusion Framework for 3D Occupancy Prediction cs.CV · 2026-02-06 · unverdicted · none · ref 21
TFusionOcc uses a family of Student's t-distribution T-primitives and a T-mixture model for multi-sensor 3D occupancy prediction, reporting state-of-the-art results on nuScenes.
Contrastive Heliophysical Image Pretraining for Solar Dynamics Observatory Records cs.CV · 2025-11-28 · unverdicted · none · ref 10
SolarCHIP contrastively pretrains CNN and Vision Transformer backbones on SDO AIA-HMI data with multi-granularity objectives, achieving SOTA on cross-modal translation and flare classification especially in low-resource settings.
MapRF: Weakly Supervised Online HD Map Construction via NeRF-Guided Self-Training cs.CV · 2025-11-24 · unverdicted · none · ref 50
MapRF reaches about 75% of fully supervised HD map accuracy on Argoverse 2 and nuScenes by generating view-consistent pseudo labels via a NeRF conditioned on map predictions and refining them with Map-to-Ray Matching in self-training.
X-IONet: Cross-Platform Inertial Odometry Network for Pedestrian and Legged Robot cs.RO · 2025-11-11 · unverdicted · none · ref 20
X-IONet combines rule-based platform classification with a dual-stage attention network to predict displacement and uncertainty from IMU data, then fuses outputs via EKF, achieving reported error reductions on pedestrian and quadruped datasets.
Remote Rowhammer Attack using Adversarial Observations on Federated Learning Clients cs.LG · 2025-05-09 · unverdicted · none · ref 44
A reinforcement learning attacker manipulates client sensor observations in federated learning to induce repetitive server memory updates, achieving around 70% repeated update rate and enabling remote Rowhammer bit flips on an automatic speech recognition model.
Fixed-Length Dense Fingerprint Representation with Alignment and Robust Enhancement cs.CV · 2025-05-06 · unverdicted · none · ref 49
FLARE introduces a fixed-length 3D dense fingerprint descriptor integrated with pose-based alignment and ridge enhancement for robust cross-modality matching.
Replacement Learning: Training Neural Networks with Fewer Parameters cs.CV · 2026-05-19 · unverdicted · none · ref 41
Replacement Learning replaces selected blocks in CNNs and ViTs with learnable parameter-fusion surrogates derived from adjacent layers to reduce full-depth backpropagation redundancy.
LymphNode: A Plug-and-Play Access Control Method for Deep Neural Networks cs.CR · 2026-05-15 · unverdicted · none · ref 45
LymphNode enforces default-deny access control on DNNs by injecting GSUAP into the feature space to neutralize utility for unauthorized queries and selectively restore it for authorized inputs carrying a stealthy credential, using under 100 samples from surrogate data.
iPay: Integrated Payment Action Recognition via Multimodal Networks and Adaptive Spatial Prior Learning cs.CV · 2026-05-11 · unverdicted · none · ref 28
iPay fuses RGB and skeleton expert streams via dual-attention and a prior-driven Spatial Difference Discriminator to reach 83.45% accuracy on 500+ real-world payment clips from onboard transit cameras.
Evidence-based Decision Modeling for Synthetic Face Detection with Uncertainty-driven Active Learning cs.CV · 2026-05-11 · unverdicted · none · ref 39
EMSFD uses Dirichlet-based evidence modeling to capture prediction uncertainty in synthetic face detection and applies uncertainty-driven active learning to achieve 15% higher accuracy than prior methods.
Multi-Level Bidirectional Biomimetic Learning for EEG-Based Visual Decoding cs.CV · 2026-05-06 · unverdicted · none · ref 36
MB2L achieves 80.5% top-1 and 97.6% top-5 accuracy on zero-shot EEG-to-image retrieval by using biomimetic modules and bidirectional contrastive learning to align neural and visual features.
Deep Reprogramming Distillation for Medical Foundation Models cs.CV · 2026-05-06 · unverdicted · none · ref 73
DRD introduces a reprogramming module and CKA-based distillation to enable efficient, robust adaptation of medical foundation models to downstream 2D/3D classification and segmentation tasks, outperforming prior PEFT and KD methods on 18 tasks.
FedPLT: Scalable, Resource-Efficient, and Heterogeneity-Aware Federated Learning via Partial Layer Training cs.DC · 2026-05-04 · unverdicted · none · ref 17
FedPLT assigns client-specific model layers for training and matches or beats full-model federated learning accuracy with 71-82 percent fewer trainable parameters per client.
Dual-LoRA: Parameter-Efficient Adversarial Disentanglement for Cross-Lingual Speaker Verification eess.AS · 2026-04-29 · unverdicted · none · ref 25
Dual-LoRA with a language-anchored adversary achieves 0.91% EER on the TidyVoice benchmark for cross-lingual speaker verification by targeting true linguistic cues while preserving speaker discriminability.
Meta-Ensemble Learning with Diverse Data Splits for Improved Respiratory Sound Classification cs.LG · 2026-04-27 · unverdicted · none · ref 19
Meta-ensemble learning on diverse ICBHI data splits reaches 66.49% Score and improves generalization on two external datasets.
Stylistic-STORM (ST-STORM): Perceiving the Semantic Nature of Appearance cs.CV · 2026-04-17 · unverdicted · none · ref 20
ST-STORM introduces a dual-branch SSL framework that disentangles semantic content from stylistic appearance using gated latent streams, JEPA for content invariance, and adversarial constraints for style capture.
