Proceedings of the 26th Annual International Conference on Machine Learning

Curriculum learning

13 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Distributionally Robust Multi-Task Reinforcement Learning via Adaptive Task Sampling

cs.LG · 2026-05-14 · unverdicted · novelty 7.0

DRATS derives a minimax objective from a feasibility formulation of MTRL to adaptively sample tasks with the largest return gaps, leading to better worst-task performance on MetaWorld benchmarks.

The Benefits of Temporal Correlations: SGD Learns k-Juntas from Random Walks Efficiently

cs.LG · 2026-05-11 · unverdicted · novelty 7.0

Temporal correlations from lazy random walks enable efficient SGD learning of k-juntas via temporal-difference loss on ReLU networks, achieving linear sample complexity in d.

Near-optimal and Efficient First-Order Algorithm for Multi-Task Learning with Shared Linear Representation

cs.LG · 2026-05-01 · unverdicted · novelty 7.0

A new first-order algorithm for multi-task learning with shared linear representation achieves near-optimal error rates in constant iterations, improving existing methods by a factor of k.

StruMPL: Multi-task Dense Regression under Disjoint Partial Supervision and MNAR Labels

cs.CV · 2026-05-19 · unverdicted · novelty 6.0

StruMPL is a multi-task dense regression model that jointly addresses disjoint partial supervision, MNAR labels, and inter-task physical constraints for improved forest biomass estimation from Earth observation.

ST-TGExplainer: Disentangling Stability and Transition Patterns for Temporal GNN Interpretability

cs.LG · 2026-05-19 · unverdicted · novelty 6.0

ST-TGExplainer disentangles stability and transition patterns in temporal graphs via a self-explainable TGNN guided by a disentangled information bottleneck objective to produce more faithful explanations.

QuadLink: Autoregressive Quad-Dominant Mesh Generation via Point-Relation Learning

cs.GR · 2026-05-16 · unverdicted · novelty 6.0

QuadLink generates anisotropic quad-dominant meshes from point clouds via a hybrid centroid-conditioned vertex linking model and a Tri-to-Quad data conversion operator.

SwAIther-Precip: Lead-Time-Aware Bias Correction Enables Kilometer-Scale Downscaling of Global AI Precipitation Forecasts over Switzerland

physics.ao-ph · 2026-05-15 · unverdicted · novelty 6.0

SwAIther-Precip uses lead-time-conditioned U-Net bias correction followed by diffusion-based super-resolution to downscale AIFS forecasts, achieving 48% CRPS reduction and ~4 km effective resolution up to 5 days lead time.

Active Tabular Augmentation via Policy-Guided Diffusion Inpainting

cs.LG · 2026-05-11 · unverdicted · novelty 6.0

TAP couples a learner-conditioned policy with diffusion inpainting to generate and selectively inject high-utility tabular augmentations, yielding up to 15.6 pp accuracy gains and 32% RMSE reduction on seven datasets under severe scarcity.

Synthetic Pre-Pre-Training Improves Language Model Robustness to Noisy Pre-Training Data

cs.CL · 2026-05-11 · unverdicted · novelty 6.0

Synthetic pre-pre-training on structured data improves LLM robustness to noisy pre-training, matching baseline loss with up to 49% fewer natural tokens for a 1B model.

Heterogeneity-Aware Dataset Scheduling for Efficient Audio Large Language Model Training

cs.SD · 2026-05-18 · unverdicted · novelty 5.0

GST uses gradient-based affinity metrics to form dataset groups and applies progressive scheduling, achieving 30-40% faster convergence than uniform mixture training on 14 AudioQA datasets while matching or exceeding performance.

Temporal Aware Pruning for Efficient Diffusion-based Video Generation

cs.CV · 2026-05-18 · unverdicted · novelty 5.0 · 2 refs

TAPE applies temporal-aware token pruning with smoothing, reselection, and timestep scheduling to speed up video diffusion models while preserving visual fidelity and coherence.

ARGUS: Policy-Adaptive Ad Governance via Evolving Reinforcement with Adversarial Umpiring

cs.CL · 2026-05-04 · unverdicted · novelty 5.0

ARGUS uses a Prosecutor-Defender-Umpire multi-agent setup plus RAG and chain-of-thought rewards to adapt ad policy enforcement to new regulations using minimal fresh labels.

Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model

cs.CV · 2025-02-14 · unverdicted · novelty 4.0

Step-Video-T2V describes a 30B-parameter text-to-video model with custom Video-VAE, 3D DiT, flow matching, and Video-DPO that claims state-of-the-art results on a new internal benchmark.

citing papers explorer

Showing 13 of 13 citing papers.

Distributionally Robust Multi-Task Reinforcement Learning via Adaptive Task Sampling
cs.LG · 2026-05-14 · unverdicted · none · ref 271

The Benefits of Temporal Correlations: SGD Learns k-Juntas from Random Walks Efficiently
cs.LG · 2026-05-11 · unverdicted · none · ref 67

Near-optimal and Efficient First-Order Algorithm for Multi-Task Learning with Shared Linear Representation
cs.LG · 2026-05-01 · unverdicted · none · ref 14

StruMPL: Multi-task Dense Regression under Disjoint Partial Supervision and MNAR Labels
cs.CV · 2026-05-19 · unverdicted · none · ref 21

ST-TGExplainer: Disentangling Stability and Transition Patterns for Temporal GNN Interpretability
cs.LG · 2026-05-19 · unverdicted · none · ref 199

QuadLink: Autoregressive Quad-Dominant Mesh Generation via Point-Relation Learning
cs.GR · 2026-05-16 · unverdicted · none · ref 164

SwAIther-Precip: Lead-Time-Aware Bias Correction Enables Kilometer-Scale Downscaling of Global AI Precipitation Forecasts over Switzerland
physics.ao-ph · 2026-05-15 · unverdicted · none · ref 15

Active Tabular Augmentation via Policy-Guided Diffusion Inpainting
cs.LG · 2026-05-11 · unverdicted · none · ref 55

Synthetic Pre-Pre-Training Improves Language Model Robustness to Noisy Pre-Training Data
cs.CL · 2026-05-11 · unverdicted · none · ref 11

Heterogeneity-Aware Dataset Scheduling for Efficient Audio Large Language Model Training
cs.SD · 2026-05-18 · unverdicted · none · ref 28

Temporal Aware Pruning for Efficient Diffusion-based Video Generation
cs.CV · 2026-05-18 · unverdicted · none · ref 113 · 2 links

ARGUS: Policy-Adaptive Ad Governance via Evolving Reinforcement with Adversarial Umpiring
cs.CL · 2026-05-04 · unverdicted · none · ref 67

Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model
cs.CV · 2025-02-14 · unverdicted · none · ref 167
