Gradient-based learning applied to document recognition
Mixed citation behavior. Most common role is background (43%).
citing papers explorer
-
STRABLE: Benchmarking Tabular Machine Learning with Strings
A new corpus of 108 mixed string-numeric tables shows that advanced tabular learners with basic string embeddings perform well on most real-world data, while large LLM encoders help on free-text heavy tables.
-
Adaptive multi-line fitting for stable line-core intensity and Doppler velocity
LineFit delivers more stable line-core intensity and Doppler velocity time series from complex multi-line solar spectra by combining adaptive windowing, asymmetric Voigt options, and split-core handling, outperforming standard fast estimators on synthetic benchmarks.
-
Stress-Testing Neural Network Verifiers with Provably Robust Instances
A reusable framework generates verification instances with provably known robustness labels, revealing numeric tolerance issues and bugs in five verifiers while introducing difficulty profiles to diagnose failure modes.
-
Quantitative Linear Logic for Neuro-Symbolic Learning and Verification
QLL is a novel logic for neuro-symbolic learning that uses ML-native operations (sum, log-sum-exp) on logits to embed constraints, satisfying most linear logic properties and showing stronger correlation between empirical robustness and formal verification than prior approaches.
-
Estimating Implicit Regularization in Deep Learning
Gradient matching empirically recovers implicit regularization effects such as l2 penalties from early stopping and dropout in neural networks.
-
On the Architectural Complexity of Neural Networks
A framework quantifies DNN complexity via tensor operations, links 40 years of breakthroughs to complexity increases, and releases a dataset of 3000+ unexplored high-complexity architectures.
-
Beyond ECE: Calibrated Size Ratio, Risk Assessment, and Confidence-Weighted Metrics
Introduces Calibrated Size Ratio (CSR) and confidence-weighted metrics to better detect overconfidence risk and calibration issues beyond the limitations of ECE.
-
BRIDGE and TCH-Net: Heterogeneous Benchmark and Multi-Branch Baseline for Cross-Domain IoT Botnet Detection
BRIDGE creates the first formal heterogeneous multi-dataset benchmark for IoT botnet detection with LODO evaluation, and TCH-Net achieves mean LODO F1 of 0.5577 while reaching F1 0.8296 on standard tests, outperforming twelve baselines.
-
Lightweight True In-Pixel Encryption with FeFET Enabled Pixel Design for Secure Imaging
SecurePix uses FeFET multidomain polarization states for in-pixel symmetric-key encryption, dropping ResNet-18 accuracy to 9.58% on MNIST and 6.98% on CIFAR-10 while supporting key-based decryption via lookup table.
-
Dynamic Free-Rider Detection in Federated Learning via Simulated Attack Patterns
S2-WEF detects dynamic free-riders in federated learning by simulating attack WEF patterns from prior global models, combining them with mutual deviation scores, and using two-dimensional clustering without proxy data or pre-training.
-
Multi-Mode Quantum Annealing for Generative Representation Learning with Boltzmann Priors
A multi-mode quantum annealing approach enables VAEs with Boltzmann priors, showing faster training and better generation than Gaussian-prior VAEs on MNIST, Fashion-MNIST, and CelebA plus improved out-of-distribution detection.
-
Selectivity and Shape in the Design of Forward-Forward Goodness Functions
Shape- and peak-sensitive goodness functions for Forward-Forward deliver up to 72pp gains over sum-of-squares, reaching 98.2% on MNIST and 89% on Fashion-MNIST.
-
Stochastic Attention via Langevin Dynamics on the Modern Hopfield Energy
Langevin sampling on the modern Hopfield energy produces training-free stochastic attention that transitions from exact retrieval to generation as temperature rises, with an entropy inflection condition marking the shift.
-
Programmable superconducting neuron with intrinsic in-memory computation and dual-timescale plasticity for ultra-efficient neuromorphic computing
A programmable superconducting LIF neuron with intrinsic static memory and dual-timescale plasticity achieves 45 GHz operation and femtojoule energy per spike.
-
Task complexity shapes internal representations and robustness in neural networks
Harder classification tasks produce neural representations whose accuracy collapses under binarization and shuffling while easier tasks remain robust, defining task complexity via the performance gap between full-precision and perturbed networks.
-
Encrypted Neural Networks without Overflows
Introduces formal verification to compute certified neuron range bounds for CKKS-encrypted neural networks, eliminating overflow failures that previously reached 47%.
-
Expectation Consistency Loss: Rethink Confidence Calibration under Covariate Shift
Derives an expectation consistency condition that is necessary and sufficient for calibration under covariate shift, and proposes the ECL loss, whose sample complexity matches that of ECE.
-
Generative Recursive Reasoning
GRAM is a latent-variable generative model that performs recursive reasoning via stochastic trajectories, trained with amortized variational inference to support multi-hypothesis reasoning and unconditional generation.
-
The Diffusion Encoder
A diffusion model serves as the encoder in an autoencoder when trained alternately with the decoder to resolve opposing update directions while retaining the standard diffusion training objective.
-
From Clever Hans to Scientific Discovery: Interpreting EEG Foundational Transformers with LRP
LRP on EEG transformers reveals Clever Hans artifacts in motor imagery tasks and a recurring central electrode cluster as a candidate sensorimotor signature of arousal.
-
Instructions Shape Production of Language, not Processing
Instructions trigger a production-centered mechanism in language models, with task-specific information stable in input tokens but varying strongly in output tokens and correlating with behavior.
-
Inducing Spatial Locality in Vision Transformers through the Training Protocol
CutMix augmentation during training induces spatial locality in early layers of Vision Transformers trained from scratch, as measured by reduced Mean Attention Distance.
-
What If We Let Forecasting Forget? A Sparse Bottleneck for Cross-Variable Dependencies
MS-FLOW uses a capacity-limited sparse routing mechanism to model only critical inter-variable dependencies in time series data, achieving state-of-the-art accuracy on 12 benchmarks with fewer but more reliable connections.
-
Flow Matching with Arbitrary Auxiliary Paths
AuxPath-FM extends flow matching to arbitrary auxiliary distributions while preserving the continuity equation and marginal training objective.
-
P-Guide: Parameter-Efficient Prior Steering for Single-Pass CFG Inference
P-Guide achieves single-pass classifier-free guidance in flow matching by modulating the initial latent state and is equivalent to standard CFG under a first-order approximation while cutting latency by half.
-
When AI Meets Science: Research Diversity, Interdisciplinarity, Visibility, and Retractions across Disciplines in a Global Surge
AI use in science has grown exponentially since 2015 but stays confined to computer science and statistics topics, shows higher retraction rates and citations, and follows distinct global adoption patterns.
-
Calculating Domain of Attraction Boundary of Power Systems Based on the Gentlest Ascent Dynamics
Applies gentlest ascent dynamics and stable manifold methods to compute domain of attraction boundaries for stable equilibria in synchronous-generator power system models.
-
Class Angular Distortion Index for Dimensionality Reduction
CADI quantifies the preservation of relative cluster angles in low-dimensional projections using internal angles from point triples.
-
Empirical Insights of Test Selection Metrics under Multiple Testing Objectives and Distribution Shifts
A broad empirical benchmark shows how 15 existing test selection metrics perform for fault detection, performance estimation, and retraining under corrupted, adversarial, temporal, natural, and label shifts across image, text, and Android data.
-
Modulation Feature Enhancement with a Multi-Stage Attention Network for Underwater Acoustic Target Recognition
A 1-D CNN with novel multi-stage spectral attention mechanisms and adjustable class-balanced focal loss improves recognition accuracy on real ship-radiated noise datasets.
-
LTBs-KAN: Linear-Time B-splines Kolmogorov-Arnold Networks
LTBs-KAN delivers linear-time B-spline evaluation in KANs plus parameter reduction via product-of-sums factorization, with competitive results on MNIST, Fashion-MNIST, and CIFAR-10.
-
QuanForge: A Mutation Testing Framework for Quantum Neural Networks
QuanForge introduces statistical mutation killing and nine post-training mutation operators for QNNs to distinguish test suites and localize vulnerable circuit regions.
-
Efficient Adversarial Training via Criticality-Aware Fine-Tuning
CAAT selects critical parameters for adversarial robustness in ViTs and applies PEFT to tune only those, yielding a 4.3% robustness drop versus full AT while using ~6% of parameters.
-
Daily Predictions of F10.7 and F30 Solar Indices with Deep Learning
SINet outperforms five prior statistical and deep learning methods on F10.7 predictions and provides the first deep learning forecasts for the F30 solar index.
-
Extraction of linearized models from pre-trained networks via knowledge distillation
Koopman theory plus knowledge distillation yields linearized models from pre-trained nets that outperform standard least-squares Koopman approximations on MNIST and Fashion-MNIST in accuracy and stability.
-
Drifting Fields are not Conservative
Drift fields are not conservative except for Gaussian kernels; sharp normalization makes them conservative for any radial kernel by equating them to score differences of kernel density estimates.
-
ML-based approach to classification and generation of structured light propagation in turbulent media
ML models classify and generate structured light in turbulence using CNNs and diffusion models enhanced by Bregman distance minimization.
-
Deep Image Clustering Based on Curriculum Learning and Density Information
IDCL adds density-based curriculum learning and density-core guidance to deep image clustering, claiming superior robustness, faster convergence, and flexibility on benchmark datasets.
-
Realistic Handwritten Multi-Digit Writer (MDW) Number Recognition Challenges
New MDW benchmarks demonstrate that isolated digit classifiers struggle with multi-digit numbers from the same writer, necessitating task-specific metrics and advanced methods.
-
Pulse Shape Discrimination Algorithms: Survey and Benchmark
A survey and benchmark of ~60 PSD algorithms on two radiation datasets finds deep learning models (MLPs and hybrids) often outperform traditional statistical methods, with an open-source Python/MATLAB toolbox and datasets released.
-
Distributed Normal Map-based Stochastic Proximal Gradient Methods over Networks
norM-DSGT and norM-ED achieve centralized stochastic proximal-gradient rates for distributed composite objectives, with norM-ED transient time O(n^3/(1-λ)^2).
-
Representation Gap: Explaining the Unreasonable Effectiveness of Neural Networks from a Geometric Perspective
Derives an asymptotic equivalent for the Representation Gap in equivariant diffusion models, showing it depends primarily on the intrinsic dimension of the task.
-
Unveiling Hidden Lyman Alpha Emitters in the DESI DR1 Data
A CNN detects 19,685 LAEs at z=2-3.5 in DESI DR1 spectra with 95% purity and completeness.
-
Automated Classification of Plasma Regions at Mars Using Machine Learning
A convolutional neural network trained on MAVEN SWIA ion spectra reliably classifies solar wind, magnetosheath, and induced magnetosphere regions at Mars, outperforming a multilayer perceptron.
-
ASTRAFier: A Novel and Scalable Transformer-based Stellar Variability Classifier
ASTRAFier is a Transformer-BiLSTM-CNN model that classifies stellar variability from light curves, reporting 94.26% accuracy on Kepler data and 88.22% on TESS, then applied to 2.8 million TESS curves to release a catalog.
-
DistributedEstimator: Distributed Training of Quantum Neural Networks via Circuit Cutting
DistributedEstimator demonstrates that circuit cutting preserves test accuracy and robustness in QNN training on Iris and MNIST while revealing that classical reconstruction dominates runtime and exponential subcircuit growth limits scaling.
-
Agglomerative Attention
Presents agglomerative attention, a linear-complexity attention model that achieves comparable performance to full attention on language modeling tasks.
-
Joint sparse coding and temporal dynamics support context reconfiguration
Joint sparse coding and temporal dynamics in mPFC and computational networks reduce cross-context interference and enhance separability, enabling better retention in lifelong learning without extra heuristics.
-
Multi-Dataset Cross-Domain Knowledge Distillation for Unified Medical Image Segmentation, Classification, and Detection
A multi-dataset cross-domain knowledge distillation approach improves unified performance on medical image segmentation, classification, and detection by transferring domain-invariant features from a joint teacher model to task-specific students.
-
Revealing Geography-Driven Signals in Zone-Level Claim Frequency Models: An Empirical Study using Environmental and Visual Predictors
Augmenting zone-level MTPL claim frequency models with coordinates, environmental features at 5 km scale, and image embeddings improves predictive accuracy on unseen postcodes across GLM, regularized GLM, and tree-based models.