Deep Learning for Classical Japanese Literature

23 Pith papers cite this work. Polarity classification is still indexing.

abstract

Much of machine learning research focuses on producing models that perform well on benchmark tasks, in turn improving our understanding of the challenges associated with those tasks. From the perspective of ML researchers, the content of the task itself is largely irrelevant, and thus there have increasingly been calls for benchmark tasks to focus more heavily on problems that are of social or cultural relevance. In this work, we introduce Kuzushiji-MNIST, a dataset that focuses on Kuzushiji (cursive Japanese), as well as two larger, more challenging datasets, Kuzushiji-49 and Kuzushiji-Kanji. Through these datasets, we wish to engage the machine learning community with the world of classical Japanese literature. The datasets are available at https://github.com/rois-codh/kmnist

citation-role summary

dataset 3

citation-polarity summary

use dataset 3

representative citing papers

Grokking of Diffusion Models: Case Study on Modular Addition

cs.LG · 2026-04-20 · unverdicted · novelty 7.0

Diffusion models show grokking on modular addition by composing periodic operand representations in simple data regimes or by separating arithmetic computation from visual denoising across timesteps in varied regimes.

Bayesian Model Merging

cs.LG · 2026-05-13 · unverdicted · novelty 6.0

Bayesian Model Merging introduces a bi-level optimization framework that merges task-specific models via closed-form Bayesian regression with an anchor prior and global hyperparameter search, outperforming baselines and nearly matching expert averages on up to 20-task vision and 5-task language Merg

Possibilistic Predictive Uncertainty for Deep Learning

cs.LG · 2026-05-01 · unverdicted · novelty 6.0 · 2 refs

DAPPr projects a possibilistic posterior over network parameters to predictions using supremum operators and approximates it with learnable Dirichlet functions to yield an efficient training objective for epistemic uncertainty.

Model Merging: Foundations and Algorithms

cs.LG · 2026-05-02 · unverdicted · novelty 6.0

New cycle-consistent optimization, task vector theory, singular vector decompositions, adaptive routing, and efficient evolutionary search provide foundations for merging neural network weights across tasks.

Interactive Pareto navigation for deep multi-task learning

cs.LG · 2026-06-17 · unverdicted · novelty 5.0

PPE is a novel predictor-corrector method for interactive Pareto set exploration in deep multi-task learning that approximates tangent spaces via Krylov subspace iterations using only matrix-vector products from automatic differentiation.

Dendritic Neural Networks with Equilibrium Propagation

cs.LG · 2026-05-01 · unverdicted · novelty 5.0

Dendritic EP matches standard EP on simple tasks but significantly outperforms it on KMNIST and FMNIST, and in deeper models, approaching the performance of backpropagation-trained dendritic networks.

Context-Aware Multipath Networks

cs.CV · 2019-07-26 · unverdicted · novelty 4.0

CAMNet uses data-dependent routing across parallel tensors in a multi-path network to outperform equivalent single-path, multi-path, and deeper networks on classification and pixel-labeling tasks for individual, sequential, and combined datasets.

citing papers explorer

Showing 1 of 1 citing paper after filters.

  • Stimulus symmetries can confound representational similarity analyses q-bio.NC · 2026-05-20 · unverdicted · none · ref 23 · internal anchor

    Stimulus symmetries render many neural representations functionally equivalent yet produce qualitatively different RSMs, including drifting ones from SGD or regularization in image-encoding networks.