AugMix: A simple data processing method to improve robustness and uncertainty. arXiv preprint arXiv:1912.02781, 2019
14 Pith papers cite this work.
citing papers explorer
- Adversarial Domain Prompt Tuning and Generation for Single Domain Generalization
  PAPT uses adversarial prompt tuning on diffusion models to generate domain-style images while preserving category features, claiming superior single-domain generalization performance.
- Adaptive Camera Sensor for Vision Models
  Lens adapts camera sensors in real time via the VisiT confidence-based quality indicator to improve vision model accuracy on domain-shifted images, shown on ImageNet-ES and a new diverse benchmark.
- Radial Basis Function Networks as Projection Heads in Self-Supervised Learning
  RBFN projection heads serve as competitive replacements for MLP heads in SSL and enable SNS, a label-free metric from RBF parameters that correlates strongly with logistic regression evaluation.
- Learning from almost nothing: How neural networks survive heavy input corruption
  Infinite-width MLPs implement a nearest-class-mean prototype classifier as their leading-order decision rule under heavy attribute noise, explaining observed robustness in experiments.
- ReSAGE-PAR: Representational Similarity Assessment for Generative Expansion in Pedestrian Attribute Recognition
  ReSAGE-PAR adapts diffusion models with LoRA, scores generated images via vision-language prompts, and applies Bayesian classification to produce pseudo-labels, yielding up to 8.7% gains when used to expand PAR datasets.
- Dual Feature Decoupling for Fine-Grained OOD Detection
  DFDNet disentangles content from style via dual modules to boost fine-grained OOD detection performance on multiple datasets.
- FDDet: Achieving Data-Efficient Food Defect Detection Under Real-World Scenarios
  FDDet is a semi-supervised object detection framework with BBoxMixUp and CGPC that outperforms standard detectors on the new FDD-48 food defect dataset under data-limited real-world conditions.
- A Composite Activation Function for Learning Stable Binary Representations
  HTAF is a sigmoid-tanh composite that approximates the Heaviside function to allow stable gradient training of binary activation networks, yielding ICBMs with stable discretization and competitive performance on image tasks.
- TINS: Test-time ID-prototype-separated Negative Semantics Learning for OOD Detection
  TINS improves OOD detection by learning negative semantics at test time with ID-prototype separation, cutting average FPR95 from 14.04% to 6.72% on the Four-OOD benchmark with ImageNet-1K.
- Medical Model Synthesis Architectures: A Case Study
  The MedMSA framework retrieves knowledge via language models, then builds formal probabilistic models to produce uncertainty-weighted differential diagnoses from symptoms.
- Agentic AIs Are the Missing Paradigm for Out-of-Distribution Generalization in Foundation Models
  Agentic AI systems are required to overcome the parameter coverage ceiling that prevents foundation models from handling certain out-of-distribution cases.
- WRF4CIR: Weight-Regularized Fine-Tuning Network for Composed Image Retrieval
  WRF4CIR uses weight-regularized fine-tuning with adversarial perturbations to mitigate overfitting in composed image retrieval and narrows the generalization gap on benchmarks.
- A Patch-based Cross-view Regularized Framework for Backdoor Defense in Multimodal Large Language Models
  A patch-augmented cross-view regularization method reduces backdoor attack success rates in multimodal LLMs by enforcing output differences between original and perturbed views while using entropy constraints to preserve benign generation quality.
- Improving Facial Emotion Recognition through Dataset Merging and Balanced Training Strategies
  Merging CK+, FER+, and KDEF datasets with online/offline augmentation and random weighted sampling enables a deep CNN to classify seven facial emotions at 82% accuracy.