Deep learning
Canonical reference. 100% of citing Pith papers cite this work as background.
fields
cs.LG 17 cs.CV 7 eess.SP 3 astro-ph.SR 2 cs.CL 2 math.OC 2 physics.ao-ph 2 physics.chem-ph 2 physics.comp-ph 2 quant-ph 2
polarities
background 14
representative citing papers
citing papers explorer
-
Cumulative Meta-Learning from Active Learning Queries for Robustness to Spurious Correlations
CAML meta-learns a progressively refined inductive bias from active-learning queries to improve robustness to spurious correlations, reporting accuracy gains on minority groups across several benchmarks.
-
Symmetrization of Loss Functions for Robust Training of Neural Networks in the Presence of Noisy Labels
Symmetrization of multi-class losses produces a unique convex symmetric loss that locally approximates others and supports robust neural training under label noise.
-
Toy Combinatorial Interpretability Models Reveal Lottery Tickets in Early Feature Space
In a combinatorial toy setting, winning lottery tickets preserve families of compatible feature locations in early feature space that balance proximity to final codes with low interference, rather than specific weight subnetworks.
-
DualTCN: A Physics-Constrained Temporal Convolutional Network for Time-Domain Marine CSEM Inversion
DualTCN is the first deep-learning model for time-domain marine CSEM inversion that regresses four earth parameters, achieves high accuracy on simulated data, and runs up to 21,000 times faster than classical optimizers.
-
Broximal Alignment for Global Non-Convex Optimization
Broximal Alignment is a novel condition under which the Ball Proximal Point Method converges to global minima in non-convex settings, generalizing quasiconvexity, star convexity, and related frameworks.
-
On the Decompositionality of Neural Networks
Neural decompositionality is defined via decision-boundary semantic preservation, and language transformers largely satisfy it under SAVED while vision models often do not.
-
Accelerating Inference for Multilayer Neural Networks with Quantum Computers
Quantum circuits for coherent multilayer neural network inference achieve quadratic to polylogarithmic speedups over classical methods depending on quantum data access models for inputs and weights.
-
Non-markovian neural quantum propagator and its application to the simulation of ultrafast nonlinear spectra
A machine learning model called neural quantum propagator is introduced to efficiently solve non-Markovian quantum dynamics described by HEOM and applied to simulate spectra of the FMO complex.
-
Prognostic Value of Lung Ultrasound Biomarkers for Readmission Risk in Congestive Heart Failure: A Pilot Data-Driven Analysis
Pilot study uses pretrained video encoder features from lung ultrasound to predict 30-day CHF readmission, finding lower-lung views and temporal differences most informative with top MLP F1 of 0.80.
-
Extracting redshifts from 2D slitless spectroscopic images using deep learning for the CSST galaxy survey
A Bayesian CNN maps 2D slitless spectral images to redshift estimates with NMAD precision 0.0104 for SNR_GI >=1 and better for brighter sources, while remaining robust to wavelength calibration errors via spatial augmentations.
-
Scaling Vision Models Does Not Consistently Improve Localisation-Based Explanation Quality
Scaling vision models by depth and parameter count does not consistently improve localisation-based explanation quality across architectures, datasets, and post-hoc methods; smaller models often perform comparably or better.
-
Leveraging Image Generators to Address Training Data Scarcity: The Gen4Regen Dataset for Forest Regeneration Mapping
Mixing real UAV imagery with 2101 AI-generated image-mask pairs improves semantic segmentation F1 scores for fine-grained forest species by over 15 percentage points overall and up to 30 points for rare classes.
-
A Meta Reinforcement Learning Approach to Goals-Based Wealth Management
MetaRL pre-trained on GBWM problems delivers near-optimal dynamic strategies in 0.01 s, achieving 97.8% of the DP-optimal utility, and handles larger problems where DP fails.
-
Lottery BP: Unlocking Quantum Error Decoding at Scale
Lottery BP adds randomness to belief propagation decoding and uses syndrome voting to achieve far higher accuracy on topological quantum codes while reducing reliance on expensive global decoders.
-
Open-Vocabulary Semantic Segmentation Network Integrating Object-Level Label and Scene-Level Semantic Features for Multimodal Remote Sensing Images
TSMNet uses a dual-branch text encoder and text-guided fusion module to integrate scene-level semantic and object-level label features from text with visual embeddings, achieving superior open-vocabulary segmentation on new multimodal remote sensing datasets.
-
Mistake gating leads to energy and memory efficient continual learning
Mistake-gated plasticity reduces neural network updates by 50-80% by gating changes on classification errors, improving efficiency for continual learning without added hyperparameters.
-
Extraction of linearized models from pre-trained networks via knowledge distillation
Koopman theory plus knowledge distillation yields linearized models from pre-trained nets that outperform standard least-squares Koopman approximations on MNIST and Fashion-MNIST in accuracy and stability.
-
Neural Networks With Dense Weights Are Not Universal Approximators
Dense ReLU networks under natural weight and dimension constraints fail to approximate certain Lipschitz functions, unlike unrestricted networks.
-
iPDB -- Optimizing Semantic SQL Queries
iPDB adds a predict operator and semantic query optimizations to SQL so that LLM and ML calls run efficiently inside the database, delivering 2.5x average and up to 30x speedup over prior systems.
-
Fusion or Confusion? Multimodal Complexity Is Not All You Need
Complex multimodal architectures do not reliably outperform unimodal baselines or a simple multimodal baseline under standardized evaluation.
-
HOLE: Homological Observation of Latent Embeddings for Neural Network Interpretability
HOLE applies persistent homology to latent embeddings in neural networks and uses visualizations such as cluster flow diagrams to reveal patterns of class separation, feature disentanglement, and robustness.
-
Pulse Shape Discrimination Algorithms: Survey and Benchmark
A survey and benchmark of ~60 PSD algorithms on two radiation datasets finds deep learning models (MLPs and hybrids) often outperform traditional statistical methods, with an open-source Python/MATLAB toolbox and datasets released.
-
Relating Simple Sentence Representations in Deep Neural Networks and the Brain
BERT activations show strongest correlation with MEG data for simple sentences; DNN representations generate synthetic brain data that improves stimuli decoding accuracy.
-
A Simulation Methodology Testbed for Typhoon Sensitivity Analysis: Framework Development and Perturbation-Response Experiments with the Pangu Weather Model
A MATLAB/ONNX testbed integrates the Pangu AI model with PID closed-loop control to perform single-input single-output perturbation-response experiments on typhoon track and intensity.
-
Classification of Single and Mixed Partial Discharges under Switching Voltage Using an AWA-CNN Framework
AWA patterns from PD pulse amplitude, width, and area enable CNNs to classify single and mixed partial discharge sources under switching voltage with over 96% test accuracy.
-
Generation of Heterogeneous PET Images from Uniform Organ Activity Maps Using a Pretrained Domain-Adapted Diffusion Model
A domain-adapted diffusion model synthesizes heterogeneous PET images from uniform organ activity maps, achieving high quantitative accuracy (CCC > 0.92) and visual realism comparable to real scans.
-
Foundation Models for Credit Risk Prediction: A Game Changer?
Tabular foundation models outperform standard methods in credit risk PD and LGD tasks, with larger gains on smaller datasets when used out-of-the-box.
-
Interpretable Neural Networks to Predict Momentum Fluxes of Orographic Gravity Waves
Neural networks predict orographic gravity wave momentum fluxes from coarse state variables with offline R² of 0.56-0.72, learn physically meaningful relationships via SHAP, and are compared to the Lott-Miller parameterization.
-
Machine Learning Enhanced Laser Spectroscopy for Multi-Species Gas Detection in Complex and Harsh Environments
Machine learning methods including denoising autoencoders, unsupervised interference mitigation, blind source separation, and certifiable classification are developed and experimentally validated to improve multi-species laser spectroscopy under complex conditions.
-
Predicting Associations between Solar Flares and Coronal Mass Ejections Using SDO/HMI Magnetograms and a Hybrid Neural Network
Hybrid neural network predicts eruptive versus confined solar flares from SDO/HMI magnetogram sequences, reports good performance, and links results to magnetic flux cancellation in polarity inversion lines.
-
Determination of Nanoparticle and Microdroplet Parameters in Levitating Microdroplets of Suspension by Speckle Image Analysis Using Convolutional Neural Networks
CNNs trained on speckle images from levitating TiO2 suspension microdroplets classify droplet diameter with better than 6% accuracy and provide useful discrimination for nanoparticle concentration and diameter, including simultaneous three-parameter classification.
-
Operator-Theoretic Energy Functionals for Impulse-Excited Nonstationary Signal Analysis
An operator-based Energy Concentration Index yields the IMRED detector that identifies defect-induced changes in impulse responses with AUC 0.908, outperforming standard Fourier and wavelet energy measures.
-
Vanishing Contributions: A Unified Framework for Smooth and Iterative Model Compression
VCON is a unified framework for smooth iterative DNN compression that uses parallel execution and an affine combination to progressively replace the original model with its compressed form during fine-tuning.
-
Bayesian Reasoning for Physics Informed Neural Networks
Introduces Laplace-approximated Bayesian PINNs for automatic loss-weight optimization when solving PDEs such as heat, wave, and Burgers equations.
-
General Inverse Design of Thin-Film Metamaterials With Convolutional Neural Networks
Convolutional neural networks are shown to perform inverse design of thin-film metamaterial stacks by learning the mapping from structure to ellipsometric and reflectance/transmittance spectra, with efficiency gains over traditional optimization as layer count increases.
-
Machine-Learning-Enhanced Non-Invasive Testing for MASLD Fibrosis: Shallow-Deep Neural Networks Versus FIB-4, Tabular Foundation Models, and Large Language Models
A 354-parameter shallow-deep neural network using age, AST, ALT, platelets and FIB-4 achieved external ROC-AUCs of 0.77 and 0.67 for advanced MASLD fibrosis, slightly above FIB-4's 0.75 and 0.60 on Malaysian and Indian cohorts.
-
A Systematic Survey on Deep Learning Architectures for Point Cloud Classification and Segmentation
A systematic literature survey that categorizes deep learning architectures for point cloud classification, part segmentation, and semantic segmentation, evaluates them on benchmarks, and discusses innovations, limitations, and future directions.
-
Joint sparse coding and temporal dynamics support context reconfiguration
Joint sparse coding and temporal dynamics in mPFC and computational networks reduce cross-context interference and enhance separability, enabling better retention in lifelong learning without extra heuristics.
-
Single-Cycle Multidirectional EOG Classification Faster than Human Reaction Time for Wearable Human-Computer Interactions
Cascaded neural networks classify 10 eye-movement classes from single-cycle EOG signals at 99% accuracy with sub-83 ms latency below human reaction time.
-
Using Deep Learning Models Pretrained by Self-Supervised Learning for Protein Localization
DINO-based ViT models pretrained on HPA FOV achieve macro F1 of 0.822 zero-shot and 0.860 after fine-tuning for protein localization on OpenCell, demonstrating effective transfer from SSL pretraining.
-
The ZTF-ULTRASAT experiment: Characterizing the non-transients in ULTRASAT's high cadence survey
ZTF high-cadence data shows RR Lyrae stars and flaring sources can mimic UV transients, with pre-existing ML catalogs offering a concrete mitigation approach.
-
Machine Learning-Based Cluster Classification to Suppress Background in a Prototype RPC Detector
Machine learning classifiers using fifteen cluster-level descriptors from time and ADC distributions effectively separate signal from background hits in prototype RPC detectors.
-
Perception Gaps in Risk, Benefit, and Value Between Experts and Public Challenge Socially Accepted AI
Experts rate AI scenarios as more likely, less risky, more beneficial, and more valuable than the public, applying different weightings to risk versus benefit.
-
Multiple Object Detection and Tracking in Panoramic Videos for Cycling Safety Analysis
A framework segments panoramic video into sub-images for detection, modifies multi-object tracking for boundary continuity, and applies it to vehicle overtaking detection in real cycling videos, reporting gains in precision and an F-score of 0.82.
-
A Proof-of-Concept Simulation-Driven Digital Twin Framework for Decision-Aware Diabetes Modeling
A simulation-driven digital twin framework is shown to generate interpretable diabetes trajectories for decision-aware analysis by combining benchmark data with controlled synthetic scenarios.
-
A Specialized Importance-Aware Quantum Convolutional Neural Network with Ring-Topology (IA-QCNN) for MGMT Promoter Methylation Prediction in Glioblastoma
IA-QCNN applies quantum principles via ring-topology convolution and importance weighting to achieve claimed high-accuracy MGMT methylation prediction from MRI, with fewer parameters and greater noise robustness than classical models.
-
Supplementary Materials to Graph Convolutional Branch and Bound
Supplementary results on 1-tree relaxation performance inside a GCN-augmented branch-and-bound solver for TSP.
-
MiniGPT: Rebuilding GPT from First Principles
MiniGPT is a self-contained PyTorch implementation of standard GPT autoregressive modeling that reaches 1.478 validation loss on Tiny Shakespeare with a 10.77M-parameter model and produces recognizable Shakespeare-style text.
-
AI-Powered Surrogate Modelling for Multiscale Combustion: A Critical Review and Opportunities
A critical review of AI surrogate models for multiscale combustion that compares supervised, unsupervised, and physics-guided methods, identifies transferability and consistency challenges, and outlines future opportunities.
-
Enhancing Laser Surface Texturing through Advanced Machine Learning Techniques
Neural networks and random forests predict surface roughness from laser parameters and material data with high accuracy, speeding up optimization and reducing experimental effort.