British Machine Vision Conference
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.LG (2)
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
citing papers explorer
- Behavioral Mode Discovery for Fine-tuning Multimodal Generative Policies
  Unsupervised behavioral mode discovery combined with mutual information rewards enables RL fine-tuning of multimodal generative policies that achieves higher success rates without losing action diversity.
- ZC-Swish: Stabilizing Deep BN-Free Networks for Edge and Micro-Batch Applications
  ZC-Swish stabilizes deep BN-free networks by anchoring activation means near zero, preventing collapse at depths 16 and beyond where standard Swish fails.