archive
Every paper Pith has read. Search by title, abstract, or pith.
14903 papers in cs.LG · page 9
-
Dropout creates two scaling-law classes by activation type
Dropout Universality: Scaling Laws and Optimal Scheduling at the Edge-of-Chaos
-
Feature importance keeps prototype explanation fidelity steady
Alike Parts: A Feature-Informed Approach to Local and Global Prototype Explanations
-
Transformer locates centromeres in Hi-C data across species
BlockFormer: Transformer-based inference from interaction maps
-
Dataset unifies 73k binaries with build variations and CVE history
ASSEMBLAGE-DEEPHISTORY: A Cross-Build Binary Dataset with Temporal Coverage
-
LLMs beat semantic similarity at scoring self-explanations
Exploring the Effectiveness of Using LLMs for Automated Assessment of Student Self Explanations in Programming Education
-
Text rendered on masks improves images and halves inference cost
UniVL: Unified Vision-Language Embedding for Spatially Grounded Contextual Image Generation
-
AgForce makes antibody design respond to specific antigens
AgForce Enables Antigen-conditioned Generative Antibody Design
-
Position weighting lifts AIME scores by over 1 point in distillation
When Are Teacher Tokens Reliable? Position-Weighted On-Policy Self-Distillation for Reasoning
-
Contact prediction step improves CDR design quality
ConTact: Contact-First Antibody CDR Design via Explicit Interface Reasoning
-
Amortized noise sampling cuts diffusion teacher variance 10x
Variance Reduction for Expectations with Diffusion Teachers
-
Amortized resampling yields 2-3x compute gains for diffusion teachers
Variance Reduction for Expectations with Diffusion Teachers
-
Attractors let iterative nets scale to 99% on extreme Sudoku
Equilibrium Reasoners: Learning Attractors Enables Scalable Reasoning
-
Embedding learning rate boost replicates muP transfer
Quantifying Hyperparameter Transfer and the Importance of Embedding Layer Learning Rate
-
Adapter restores evolutionary diversity to GNN antibody design
EvoStruct: Bridging Evolutionary and Structural Priors for Antibody CDR Design via Protein Language Model Adaptation
-
Symmetry match lifts velocity reconstruction accuracy 35%
Velocityformer: Broken-Symmetry-Matched Equivariant Graph Transformers for Cosmological Velocity Reconstruction
-
Platform lets humans and AIs co-author and iterate on papers
AiraXiv: An AI-Driven Open-Access Platform for Human and AI Scientists
-
Learnable graphs can replace fixed schemas for relational deep learning
Is Fixing Schema Graphs Necessary? Full-Resolution Graph Structure Learning for Relational Deep Learning
-
JIT compilation speeds web agents by 10 times
Agent JIT Compilation for Latency-Optimizing Web Agent Planning and Scheduling
-
Rank-1 line from first 50 steps matches full RLVR at 15% cost
You Only Need Minimal RLVR Training: Extrapolating LLMs via Rank-1 Trajectories
-
DelTA raises math scores by over 3 points on 8B models
DelTA: Discriminative Token Credit Assignment for Reinforcement Learning from Verifiable Rewards
-
ML weights GNSS signals to cut urban positioning errors
A Machine Learning Framework for Weighted Least Squares GNSS Positioning based on Activation Functions
-
Randomization fixes simulator shift but reachability gaps persist
Mind the Sim-to-Real Gap & Think Like a Scientist
-
Rubric embeddings cut disparities in admissions models
Mitigating Label Bias with Interpretable Rubric Embeddings
-
Deeper networks approximate structured functions with fewer parameters
Approximation Theory for Neural Networks: Old and New
-
Fitzhugh-Nagumo networks admit equilibrium propagation via self-adjoint operators
Equilibrium Propagation and Hamiltonian Inference in the Diffusive Fitzhugh-Nagumo Model
-
PyTorch library matches specialized tools in LLM tuning
torchtune: PyTorch native post-training library
-
Per-cell dispersion cuts tail forecast error 12.5 percent
Neural Negative Binomial Regression for Weekly Seismicity Forecasting: Per-Cell Dispersion Estimation and Tail Risk Assessment
-
New Laplacian lets GNNs keep Gaussian means and covariances intact
Gaussian Sheaf Neural Networks
-
Blind agents rotate Baoding balls 13 times in 10 seconds
roto 2.0: The Robot Tactile Olympiad
-
Presents new structural results and pairwise improper-learning frameworks for…
Polynomial-Time Robust Multiclass Linear Classification under Gaussian Marginals
-
Channel-wise repair boosts 90% sparse ResNet accuracy to 55.6%
Adaptive Signal Resuscitation: Channel-wise Post-Pruning Repair for Sparse Vision Networks
-
Preference weighting improves data selection for LLM fine-tuning
PRISM: Preference-Aware Influence Function Based Data Selection Method for Efficient Fine-Tuning
-
One embedding predicts conditions and retrieves precedents
HiRes: Inspectable Precedent Memory for Reaction Condition Recommendation
-
Gossip-based critic sharing lifts multi-cell OFDMA sum-rates in 6G
FedCritic: Serverless Federated Critic Learning-based Resource Allocation for Multi-Cell OFDMA in 6G
-
CKD models that look perfect internally fail on new data
Calibration, Uncertainty Communication, and Deployment Readiness in CKD Risk Prediction: A Framework Evaluation Study
-
Curriculum learning cuts modality imbalance in emotion chats
Leveraging Self-Paced Curriculum Learning for Enhanced Modality Balance in Multimodal Conversational Emotion Recognition
-
LLM agent benchmarks disclose only 38 percent of evaluation details
What Twelve LLM Agent Benchmark Papers Disclose About Themselves: A Pilot Audit and an Open Scoring Schema
-
Models converge without recovering main latent factors
Memorisation, convergence and generalisation in generative models
-
One foundation model to run all 6G tasks autonomously
Towards Resilient and Autonomous Networks: A BlueSky Vision on AI-Native 6G
-
Transport maps to PDE measures are Hölder continuous
On the Regularity and Generalization of One-Step Wasserstein-guided Generative Models for PDE-Induced Measures
-
One model shifts image restoration from precise to creative
Disentangling Generation and Regression in Stochastic Interpolants for Controllable Image Restoration
-
Simulation feedback picks best synthetic scenes for driving models
Closed Loop Dynamic Driving Data Mixture for Real-Synthetic Co-Training
-
Personalised method raises iron deficiency prediction at two clinics
Embedding-Based Federated Learning with Runtime Governance for Iron Deficiency Prediction
-
CNNs classify six PD source types under switching voltage at 96% accuracy
Classification of Single and Mixed Partial Discharges under Switching Voltage Using an AWA-CNN Framework
-
Multi-agent reports raise LLM scaffold performance by 30 points
Insights Generator: Systematic Corpus-Level Trace Diagnostics for LLM Agents
-
PDE residual selects training data to cut neural operator costs
Data-Efficient Neural Operator Training via Physics-Based Active Learning
-
Multi-agent system turns full LLM traces into evidence-backed insights
Insights Generator: Systematic Corpus-Level Trace Diagnostics for LLM Agents
-
Debiasing fixes bias in bilevel hypergradients
Semiparametric Efficient Bilevel Gradient Estimation
-
43M-paper graph gives AI agents deterministic cross-field links
SciAtlas: A Large-Scale Knowledge Graph for Automated Scientific Research
-
Delta-Rule linear transformers gain up to 4.3× speed on NPUs
Fast and Stable Triangular Inversion for Delta-Rule Linear Transformers