pith. machine review for the scientific record

arxiv: 1812.01718 · v1 · submitted 2018-12-03 · cs.CV · cs.LG · stat.ML

Recognition: unknown

Deep Learning for Classical Japanese Literature

Authors on Pith: no claims yet
classification: cs.CV · cs.LG · stat.ML
keywords: japanese, learning, tasks, benchmark, classical, dataset, datasets, focuses
original abstract

Much of machine learning research focuses on producing models which perform well on benchmark tasks, in turn improving our understanding of the challenges associated with those tasks. From the perspective of ML researchers, the content of the task itself is largely irrelevant, and thus there have increasingly been calls for benchmark tasks to more heavily focus on problems which are of social or cultural relevance. In this work, we introduce Kuzushiji-MNIST, a dataset which focuses on Kuzushiji (cursive Japanese), as well as two larger, more challenging datasets, Kuzushiji-49 and Kuzushiji-Kanji. Through these datasets, we wish to engage the machine learning community into the world of classical Japanese literature. Dataset available at https://github.com/rois-codh/kmnist
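The repository above distributes the datasets both in the original MNIST binary format and as NumPy arrays. As a minimal sketch of getting started, Kuzushiji-MNIST (the 10-class, 28x28 grayscale drop-in replacement for MNIST) can also be loaded through torchvision's built-in KMNIST wrapper; this assumes a PyTorch environment, which the page itself does not prescribe:

# Minimal sketch: load Kuzushiji-MNIST via torchvision's KMNIST wrapper.
# Assumes torch and torchvision are installed; data is downloaded to ./data.
import torch
from torchvision import datasets, transforms

transform = transforms.ToTensor()  # converts 28x28 PIL images to [1, 28, 28] float tensors

# 60,000 training and 10,000 test images across 10 cursive-hiragana classes
train_set = datasets.KMNIST(root="./data", train=True, download=True, transform=transform)
test_set = datasets.KMNIST(root="./data", train=False, download=True, transform=transform)

train_loader = torch.utils.data.DataLoader(train_set, batch_size=64, shuffle=True)
images, labels = next(iter(train_loader))
print(images.shape, labels.shape)  # torch.Size([64, 1, 28, 28]) torch.Size([64])

Because the shapes and label space match MNIST exactly, any existing MNIST training script should run on Kuzushiji-MNIST unchanged; the larger Kuzushiji-49 and Kuzushiji-Kanji datasets are available only from the repository itself.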

This paper has not been read by Pith yet.

discussion (0)


Forward citations

Cited by 11 Pith papers

Reviewed papers in the Pith corpus that reference this work, sorted by Pith novelty score.

  1. Quantum Interval Bound Propagation for Certified Training of Quantum Neural Networks

    quant-ph 2026-05 unverdicted novelty 7.0

    QIBP adapts interval bound propagation to quantum neural networks for certified adversarial robustness via interval and affine arithmetic implementations.

  2. Grokking of Diffusion Models: Case Study on Modular Addition

    cs.LG 2026-04 unverdicted novelty 7.0

    Diffusion models show grokking on modular addition by composing periodic operand representations in simple data regimes or by separating arithmetic computation from visual denoising across timesteps in varied regimes.

  3. Bayesian Model Merging

    cs.LG 2026-05 unverdicted novelty 6.0

    Bayesian Model Merging introduces a bi-level optimization framework that merges task-specific models via closed-form Bayesian regression with an anchor prior and global hyperparameter search, outperforming baselines a...

  4. Model Merging: Foundations and Algorithms

    cs.LG 2026-05 unverdicted novelty 6.0

    New cycle-consistent optimization, task vector theory, singular vector decompositions, adaptive routing, and efficient evolutionary search provide foundations for merging neural network weights across tasks.

  5. Possibilistic Predictive Uncertainty for Deep Learning

    cs.LG 2026-05 unverdicted novelty 6.0

    DAPPr introduces a possibilistic framework that projects parameter posteriors to predictions via supremum and approximates them with Dirichlet possibility functions to yield efficient, closed-form epistemic uncertaint...

  6. Efficient Mutation Testing of Quantum Machine Learning Models

    quant-ph 2026-04 unverdicted novelty 6.0

    New mutation operators and directed mutant generation produce more diverse faulty quantum neural network circuits than prior techniques, as shown in experiments.

  7. Controlled Steering-Based State Preparation for Adversarial-Robust Quantum Machine Learning

    quant-ph 2026-04 unverdicted novelty 6.0

    A passive steering method for quantum state preparation improves adversarial accuracy in QML models by up to 40% across tested cases.

  8. Task Alignment: A simple and effective proxy for model merging in computer vision

    cs.CV 2026-04 unverdicted novelty 6.0

    Task alignment serves as an efficient proxy for hyperparameter selection in model merging, accelerating the process by orders of magnitude while preserving performance in vision models with heterogeneous decoders.

  9. Continual Distillation of Teachers from Different Domains

    cs.LG 2026-04 conditional novelty 6.0

    SE2D stabilizes continual distillation across heterogeneous teachers by preserving logits on external unlabeled data to mitigate unseen knowledge forgetting.

  10. Risk-Consistent Multiclass Learning from Random Label-Subset Membership Queries

    cs.LG 2026-05 unverdicted novelty 5.0

    The paper introduces risk-consistent multiclass learning from random label-subset queries by deriving an unbiased risk estimator under ERM, plus non-negative and absolute-value corrections, with generalization bounds ...

  11. Dendritic Neural Networks with Equilibrium Propagation

    cs.LG 2026-05 unverdicted novelty 5.0

    Dendritic EP matches standard EP on simple tasks but significantly outperforms it on KMNIST and FMNIST, and in deeper models, approaching the performance of backpropagation-trained dendritic networks.