International Conference on Machine Learning
12 Pith papers cite this work.
Years: 2026 (12)
Verdicts: unverdicted (12)
citing papers explorer
- Empirical Evidence for Simply Connected Decision Regions in Image Classifiers
  Empirical tests with quad-mesh filling indicate that decision regions in modern image classifiers are simply connected.
- One-Step Diffusion with Inverse Residual Fields for Unsupervised Industrial Anomaly Detection
  OSD-IRF performs unsupervised industrial anomaly detection with a single diffusion step by evaluating anomalies in inverse residual field space under a Gaussian, delivering SOTA or competitive results with roughly 2x speedup.
- The Multi-Block DC Function Class: Theory, Algorithms, and Applications
  The Multi-Block DC class admits polynomial-size DC decompositions for problems that require exponential size under standard DC programming and supplies explicit constructive formulations for deep ReLU networks together with convergent batch and stochastic algorithms.
- Low Latency Gaze Tracking via Latent Optical Sensing
  A hardware prototype performs gaze estimation by optically encoding task-relevant features with a microlens array and mask, captured on a 4x4 phototransistor array and decoded by a small neural network, reaching 3.4 ms latency with competitive accuracy.
- Venus-DeFakerOne: Unified Fake Image Detection & Localization
  DeFakerOne integrates InternVL2 and SAM2 into a single model that achieves state-of-the-art results on 39 detection and 9 localization benchmarks for unified fake image detection and localization.
- PragLocker: Protecting Agent Intellectual Property in Untrusted Deployments via Non-Portable Prompts
  PragLocker generates function-preserving but non-portable prompts for LLM agents via code-symbol semantic anchoring followed by target-model feedback noise injection.
- DyGRO-VLA: Cross-Task Scaling of Vision-Language-Action Models via Dynamic Grouped Residual Optimization
  DyGRO-VLA is a two-stage optimization framework for cross-task scaling of Vision-Language-Action models via dynamic grouped residual optimization in RL.
- Higher Resolution, Better Generalization: Unlocking Visual Scaling in Deep Reinforcement Learning
  Higher-resolution observations with global-average-pooling encoders improve RL performance and generalization by enabling more localized visual attention, yielding up to 28% gains over standard Impala encoders.
- Lightning Unified Video Editing via In-Context Sparse Attention
  ISA prunes low-saliency context tokens and routes queries by sharpness to either full or 0-th order Taylor sparse attention, enabling LIVEditor to cut attention latency ~60% while beating prior video editing methods on three benchmarks.
- Learning Invariant Modality Representation for Robust Multimodal Learning from a Causal Inference Perspective
  CmIR uses causal inference to separate invariant causal representations from spurious ones in multimodal data, improving generalization under distribution shifts and noise via invariance, mutual information, and reconstruction constraints.
- ESsEN: Training Compact Discriminative Vision-Language Transformers in a Low-Resource Setting
  ESsEN is a parameter-efficient two-tower vision-language transformer that matches larger models on discriminative tasks after training end-to-end with limited data and resources.
- AgriKD: Cross-Architecture Knowledge Distillation for Efficient Leaf Disease Classification
  AgriKD distills multi-level knowledge from Vision Transformers to lightweight CNNs, achieving comparable leaf disease classification accuracy with 172x fewer parameters and 18-22x faster inference.