HS-FNO lifts the state to include history and decomposes updates into a learned future-slice predictor plus an exact shift-append transport, yielding lower rollout errors than standard or lag-stack FNO baselines on five non-Markovian PDE families.
Long short-term memory
Canonical reference. 74% of citing Pith papers cite this work as background.
representative citing papers
RAVEN proposes a regime-aware MoE architecture with cumulative importance thresholding and correlation-aware weighting to adaptively select temporal context for non-stationary financial forecasting.
ConTex learns a global intervention strategy via a decomposed temporal-conditional encoder architecture to generate consistent, sparse counterfactuals for time series models in a single forward pass.
Introduces the binning semiring and causal graphical models to show that correlational evaluation of learnability in formal language tasks leads to incorrect conclusions from confounders.
RESCAST-100K is a large-scale benchmark dataset of simulated and real residential energy data for cross-domain load and temperature forecasting.
Repetition rate mismatch between small-scale proxies and target budgets is the main reason data mixture experiments do not scale; a subsampling procedure that equalizes repetition rates recovers optimal mixtures from 1/16-scale experiments.
Introduces a continuous injective embedding for Log-NCDEs that builds log-signatures from data increments without interpolation or imputation while preserving compact-set universality.
PaperFit uses rendered page images in a closed loop to diagnose and repair typesetting defects in LaTeX documents, outperforming baselines on a new benchmark of 200 papers.
LG-CoTrain, an LLM-guided co-training method, outperforms classical semi-supervised baselines for crisis tweet classification in low-resource settings with 5-25 labeled examples per class.
SGC-RML creates an 8D symptom atlas from multimodal PD data and integrates conformal calibration to deliver reliable, rejectable longitudinal assessments.
BadmintonGRF is a new public multimodal dataset and benchmark that pairs multi-view video with instrumented GRF for markerless load estimation in badminton.
Adding temporal memory via LIF, precision-weighted gating, and anticipatory prediction to MoE routers recovers effective expert selection at distribution transitions, with ablation confirming a super-additive beta-ant interaction.
AsmRAG detects malware at 96% F1 and attributes families at 95% F1 by retrieving functionally similar assembly code via LLM embeddings and density-weighted anchor selection, remaining robust to metamorphic obfuscation.
Preconditioned delta-rule models with a diagonal curvature approximation improve upon standard DeltaNet, GDN, and KDA by better approximating the test-time regression objective.
BRIDGE creates the first formal heterogeneous multi-dataset benchmark for IoT botnet detection with LODO evaluation, and TCH-Net achieves mean LODO F1 of 0.5577 while reaching F1 0.8296 on standard tests, outperforming twelve baselines.
FactorEngine mines alpha factors as Turing-complete code via LLM-guided directional search, parameter separation, and a multi-agent pipeline that converts financial reports into executable programs, delivering higher IC/ICIR and Sharpe ratios than baselines in backtests.
Koopman autoencoders with forcings and temporal unrolling deliver accurate year-long predictions for coastal-ocean models at 300-1400x speedup, outperforming POD in two of three cases.
Temporal Graph Networks combine memory modules and graph operators to learn on dynamic graphs as timed event sequences, outperforming prior methods on transductive and inductive tasks while unifying earlier models as special cases.
BERT stores relational knowledge extractable via cloze queries without fine-tuning and matches supervised baselines on open-domain QA tasks.
Mixed precision training uses FP16 for most computations, FP32 master weights for accumulation, and loss scaling to enable accurate training of large DNNs with halved memory usage.
Characterizes an estimation-prediction tradeoff in binary logistic models for causal probabilistic temporal graphs and proposes a framework to jointly evaluate temporal link prediction with causal parameter recovery via Cramér-Rao bounds.
Low-bit post-training quantization of reasoning LLMs increases reasoning token counts while preserving accuracy, introducing a hidden test-time compute cost.
ViViT model predicts full viscoelastic droplet impact dynamics from initial 10-20% of VOF simulation data, reducing cost by 80-90% while capturing spreading and bouncing regimes.
Proposes feature splitting and a closed-form bound on extrapolation range to enable zero-shot topological out-of-domain generalization in dynamical systems reconstruction across tipping points.
citing papers explorer
- PaperFit: Vision-in-the-Loop Typesetting Optimization for Scientific Documents
PaperFit uses rendered page images in a closed loop to diagnose and repair typesetting defects in LaTeX documents, outperforming baselines on a new benchmark of 200 papers.
- SGC-RML: A reliable and interpretable longitudinal assessment for PD in real-world DNS
SGC-RML creates an 8D symptom atlas from multimodal PD data and integrates conformal calibration to deliver reliable, rejectable longitudinal assessments.
- Fast MoE Inference via Predictive Prefetching and Expert Replication
Dynamic replication of predicted overloaded experts in MoE models achieves near-100% GPU utilization and up to 3x faster inference while retaining 90-95% of baseline performance.
- When AI Meets Science: Research Diversity, Interdisciplinarity, Visibility, and Retractions across Disciplines in a Global Surge
AI use in science has grown exponentially since 2015 but stays confined to computer science and statistics topics, shows higher retraction rates and citations, and follows distinct global adoption patterns.
- Hybrid Machine Learning and Physical Modeling of Feedstock Deformation During Robotic 3D Printing of Continuous Fiber Thermoplastic Composites
A hybrid Kelvin-Voigt viscoelastic and stabilized neural ODE model, identified from DMA and DSC experiments, predicts composite prepreg deformation in robotic 3D printing and generalizes beyond training temperatures.
- ACT: Anti-Crosstalk Learning for Cross-Sectional Stock Ranking via Temporal Disentanglement and Structural Purification
ACT disentangles temporal scales in stock sequences and purifies structural relations in graphs to achieve state-of-the-art cross-sectional stock ranking on CSI300 and CSI500 with up to 74.25% improvement.
- Daily Predictions of F10.7 and F30 Solar Indices with Deep Learning
SINet outperforms five prior statistical and deep learning methods on F10.7 predictions and provides the first deep learning forecasts for the F30 solar index.
- Time-Warping Recurrent Neural Networks for Transfer Learning
Time-warping enables RNN transfer learning across time scales in physical systems by rescaling time in pretrained LSTMs, matching accuracy of other methods with minimal parameter changes.
- Gated Memory Policy
GMP selectively activates and represents memory via a gate and lightweight cross-attention, yielding 30.1% higher success on non-Markovian robotic tasks while staying competitive on Markovian ones.
- Neural Network-Based Virtual Wheel-Speed Sensor for Enhanced Low-Velocity State Estimation
A neural network fuses wheel and motor speed signals to cut wheel-speed estimation error by up to 85% versus the production sensor on real Volkswagen ID.7 data.
- A Resource-Efficient Hybrid CNN-LSTM network for image-based bean leaf disease classification
A lightweight hybrid CNN-LSTM network classifies bean leaf diseases at 94.38% accuracy and 1.86 MB size on the ibean dataset, with reported state-of-the-art F1 scores using EfficientNet-B7+LSTM.
- A Proof-of-Concept Simulation-Driven Digital Twin Framework for Decision-Aware Diabetes Modeling
A simulation-driven digital twin framework is shown to generate interpretable diabetes trajectories for decision-aware analysis by combining benchmark data with controlled synthetic scenarios.
- Multilevel neural networks with dual-stage feature fusion for human activity recognition
Multilevel CNN-LSTM architectures using both late and intermediate feature fusion achieve higher accuracy in human activity recognition than late fusion alone on two benchmark datasets.
- Deep Learning for Sequential Decision Making under Uncertainty: Foundations, Frameworks, and Frontiers
A tutorial framing deep learning as a complement to optimization for sequential decision-making under uncertainty, with applications in supply chains, healthcare, and energy.
- MinMax Recurrent Neural Cascades