6 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (6)
Representative citing papers
citing papers explorer
- Lost in the Hype: Revealing and Dissecting the Performance Degradation of Medical Multimodal Large Language Models in Image Classification
  Medical MLLMs degrade on image classification due to four failure modes in visual representation quality, connector projection fidelity, LLM comprehension, and semantic mapping alignment, quantified by feature probing on 14 models across 3 datasets.
- Compute Only Once: UG-Separation for Efficient Large Recommendation Models
  The UG-Separation framework disentangles user-side and item-side flows in TokenMixer dense-interaction models to enable reusable user computations, cutting inference latency by up to 20% in ByteDance production scenarios.
- KAN-MLP-Mixer: A comprehensive investigation of the usage of Kolmogorov-Arnold Networks (KANs) for improving IMU-based Human Activity Recognition
  A hybrid KAN-MLP model for IMU-based human activity recognition achieves a 5.33% relative macro F1 improvement over pure MLPs on eight datasets by placing KANs at the input embedding and classification stages.
- GenHAR: Generalizing Cross-domain Human Activity Recognition for Last-mile Delivery
  GenHAR improves cross-domain human activity recognition accuracy by 9.97% with 6.4x lower FLOPs via tokenized sensor data, frequency channel correlations, selective masking, and efficient attention, with deployment detecting 2.15 billion activities.
- ALAS: Adaptive Long-Horizon Action Synthesis via Async-pathway Stream Disentanglement
  ALAS disentangles environment and self-state streams via bio-inspired modules to deliver 23% higher subtask success and 29% better execution efficiency on long-horizon HSI tasks.
- Beyond Dense Connectivity: Explicit Sparsity for Scalable Recommendation
  SSR uses static random filters and iterative competitive sparse mechanisms to explicitly enforce sparsity in recommendation models, outperforming dense baselines on public and billion-scale industrial datasets.