The global empirical NTK for finite-width networks has a universal Kronecker-core form that makes it structurally low-rank and biases gradient descent toward dominant modes of joint input-hidden activity.
Title resolution pending
11 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years
2026 11verdicts
UNVERDICTED 11roles
background 1polarities
background 1representative citing papers
SNMPP builds a product-form neural influence kernel from a signed interaction network over event classes and a delay-aware monotonic temporal network to enable explicit discovery of inter-event relationships alongside strong prediction.
An entropy criterion on mean representations characterises the polarised regime in VAEs and related models, with theoretical links to KL minimisation and empirical tests across several architectures.
Bifurcations cause sNTK to reduce to a dominant rank-one channel matching normal forms, collapsing effective rank and funneling gradient descent into critical dynamical directions.
SPHERE applies a Parseval penalty to MoE policies in continual RL to maintain spectral plasticity, yielding 133% and 50% higher average success on MetaWorld and HumanoidBench versus unregularized MoE baselines.
EmoMM benchmark reveals Video Contribution Collapse in MLLMs for emotion recognition under modality conflict and missingness, mitigated by CHASE head-level attention steering.
Common ID estimators fail to track the true intrinsic dimension of neural representations and are instead driven by other factors.
LLM filtering of embedding-based stock networks raises long-short Sharpe ratio from 0.742 to 0.820 and cuts max drawdown from -10.47% to -7.85% in 2011-2019 S&P 500 backtests.
Covariance-aware ridge and combined l1-l2 regularizers for neural networks yield better predictive performance and complexity control than standard penalties in simulations and applications to cooling-load prediction and leukemia classification.
The paper proposes Trajectory Regularized Merging (TRM) to enable storage-free model merging in continual learning by optimizing in an augmented trajectory subspace with task alignment, prediction consistency, and gradient responsiveness objectives, claiming SOTA results.
Logistic regression using TF-IDF features and three metadata attributes achieves 0.8028 accuracy, 0.8003 weighted F1, and 0.7276 macro F1 on three-class Indonesian sentiment classification from a 707-sample imbalanced dataset.
citing papers explorer
-
The Global Empirical NTK: Self-Referential Bias and Dimensionality of Gradient Descent Learning
The global empirical NTK for finite-width networks has a universal Kronecker-core form that makes it structurally low-rank and biases gradient descent toward dominant modes of joint input-hidden activity.
-
Structured Neural Marked Point Processes for Interpretable Event Interaction Modeling
SNMPP builds a product-form neural influence kernel from a signed interaction network over event classes and a delay-aware monotonic temporal network to enable explicit discovery of inter-event relationships alongside strong prediction.
-
Entropy-Based Characterisation of the Polarised Regime in Latent Variable Models
An entropy criterion on mean representations characterises the polarised regime in VAEs and related models, with theoretical links to KL minimisation and empirical tests across several architectures.
-
State-Space NTK Collapse Near Bifurcations
Bifurcations cause sNTK to reduce to a dominant rank-one channel matching normal forms, collapsing effective rank and funneling gradient descent into critical dynamical directions.
-
SPHERE: Mitigating the Loss of Spectral Plasticity in Mixture-of-Experts for Deep Reinforcement Learning
SPHERE applies a Parseval penalty to MoE policies in continual RL to maintain spectral plasticity, yielding 133% and 50% higher average success on MetaWorld and HumanoidBench versus unregularized MoE baselines.
-
EmoMM: Benchmarking and Steering MLLM for Multimodal Emotion Recognition under Conflict and Missingness
EmoMM benchmark reveals Video Contribution Collapse in MLLMs for emotion recognition under modality conflict and missingness, mitigated by CHASE head-level attention steering.
-
Rethinking Intrinsic Dimension Estimation in Neural Representations
Common ID estimators fail to track the true intrinsic dimension of neural representations and are instead driven by other factors.
-
Cross-Stock Predictability via LLM-Augmented Semantic Networks
LLM filtering of embedding-based stock networks raises long-short Sharpe ratio from 0.742 to 0.820 and cuts max drawdown from -10.47% to -7.85% in 2011-2019 S&P 500 backtests.
-
Adaptive Norm-Based Regularization for Neural Networks
Covariance-aware ridge and combined l1-l2 regularizers for neural networks yield better predictive performance and complexity control than standard penalties in simulations and applications to cooling-load prediction and leukemia classification.
-
Revitalizing the Beginning: Avoiding Storage Dependency for Model Merging in Continual Learning
The paper proposes Trajectory Regularized Merging (TRM) to enable storage-free model merging in continual learning by optimizing in an augmented trajectory subspace with task alignment, prediction consistency, and gradient responsiveness objectives, claiming SOTA results.
-
Hybrid TF--IDF Logistic Regression and MLP Neural Baseline for Indonesian Three-Class Sentiment Analysis on Social Media Text
Logistic regression using TF-IDF features and three metadata attributes achieves 0.8028 accuracy, 0.8003 weighted F1, and 0.7276 macro F1 on three-class Indonesian sentiment classification from a 707-sample imbalanced dataset.