The authors identify a Golden Partition Zone based on an intra-class variance shift in entropy bounds that enables intrinsic model inversion resistance when partitioning neural networks for collaborative inference.
hub
Deep Learning for Classical Japanese Literature
23 Pith papers cite this work. Polarity classification is still indexing.
abstract
Much of machine learning research focuses on producing models which perform well on benchmark tasks, in turn improving our understanding of the challenges associated with those tasks. From the perspective of ML researchers, the content of the task itself is largely irrelevant, and thus there have increasingly been calls for benchmark tasks to more heavily focus on problems which are of social or cultural relevance. In this work, we introduce Kuzushiji-MNIST, a dataset which focuses on Kuzushiji (cursive Japanese), as well as two larger, more challenging datasets, Kuzushiji-49 and Kuzushiji-Kanji. Through these datasets, we wish to engage the machine learning community into the world of classical Japanese literature. Dataset available at https://github.com/rois-codh/kmnist
hub tools
citation-role summary
citation-polarity summary
roles
dataset 3
polarities
use dataset 3
representative citing papers
QIBP adapts interval bound propagation to quantum neural networks for certified adversarial robustness via interval and affine arithmetic implementations.
Diffusion models show grokking on modular addition by composing periodic operand representations in simple data regimes or by separating arithmetic computation from visual denoising across timesteps in varied regimes.
Defines saturation index S(K) = erank(Σ̂_W^(K))/K that identifies when linear discriminant stabilizes in binary few-shot classification, with empirical phase diagram and stopping-rule AUC of 0.752 on 17 tasks.
Infinite-width MLPs implement a nearest-class-mean prototype classifier as their leading-order decision rule under heavy attribute noise, explaining observed robustness in experiments.
ODE-M formulates continual model merging as a barrier-aware ODE trajectory in parameter space, using first-order feedback and a utility-aware schedule to balance retained knowledge and new task performance.
E-PMQ improves 4-bit quantization accuracy on merged models by 8-42 points across CLIP and GLUE tasks through expert-guided calibration and merged-weight anchoring.
Bayesian Model Merging introduces a bi-level optimization framework that merges task-specific models via closed-form Bayesian regression with an anchor prior and global hyperparameter search, outperforming baselines and nearly matching expert averages on up to 20-task vision and 5-task language Merg
DAPPr projects a possibilistic posterior over network parameters to predictions using supremum operators and approximates it with learnable Dirichlet functions to yield an efficient training objective for epistemic uncertainty.
ACE-Merging estimates task input covariances from parameter differences to enable closed-form data-free merging that reduces interference and outperforms prior baselines on vision and language tasks.
New MDW benchmarks demonstrate that isolated digit classifiers struggle with multi-digit numbers from the same writer, necessitating task-specific metrics and advanced methods.
Hardness-Based Resampling reduces class recall gaps in balanced datasets by up to 32% on CIFAR-10 and 16% on CIFAR-100 by prioritizing harder samples over random or frequency-based selection.
New cycle-consistent optimization, task vector theory, singular vector decompositions, adaptive routing, and efficient evolutionary search provide foundations for merging neural network weights across tasks.
New mutation operators and directed mutant generation produce more diverse faulty quantum neural network circuits than prior techniques, as shown in experiments.
A passive steering method for quantum state preparation improves adversarial accuracy in QML models by up to 40% across tested cases.
Task alignment serves as an efficient proxy for hyperparameter selection in model merging, accelerating the process by orders of magnitude while preserving performance in vision models with heterogeneous decoders.
SE2D stabilizes continual distillation across heterogeneous teachers by preserving logits on external unlabeled data to mitigate unseen knowledge forgetting.
PPE is a novel predictor-corrector method for interactive Pareto set exploration in deep multi-task learning that approximates tangent spaces via Krylov subspace iterations using only matrix-vector products from automatic differentiation.
AML outperforms cross-validated baselines including CNNs on 50-2000 example image datasets and is comparable to XGBoost/LightGBM on tabular data using only training data and no task-dependent hyperparameters.
Stimulus symmetries render many neural representations functionally equivalent yet produce qualitatively different RSMs, including drifting ones from SGD or regularization in image-encoding networks.
The paper introduces risk-consistent multiclass learning from random label-subset queries by deriving an unbiased risk estimator under ERM, plus non-negative and absolute-value corrections, with generalization bounds and consistency results.
Dendritic EP matches standard EP on simple tasks but significantly outperforms it on KMNIST and FMNIST, and in deeper models, approaching the performance of backpropagation-trained dendritic networks.
CAMNet uses data-dependent routing across parallel tensors in a multi-path network to outperform equivalent single-path, multi-path, and deeper networks on classification and pixel-labeling tasks for individual, sequential, and combined datasets.
citing papers explorer
- Stimulus symmetries can confound representational similarity analyses