Jacobian descent for multi-objective optimization. arXiv preprint arXiv:2406.16232
3 Pith papers cite this work. Polarity classification is still indexing.
years: 2026 (3)
verdicts: UNVERDICTED (3)
roles: background (1)
polarities: background (1)
citing papers explorer
- When Losses Align: Gradient-Based Composite Loss Weighting for Efficient Pretraining
  A bilevel method learns composite pretraining loss weights online via gradient alignment with a downstream objective, matching tuned baselines at roughly 30% extra cost over one training run.
- Delve into the Applicability of Advanced Optimizers for Multi-Task Learning
  APT augments multi-task learning by adapting advanced optimizers via momentum balancing and light direction preservation, delivering performance gains on four standard MTL datasets.
- Accessing the performance of CC2 for excited state dynamics: a benchmark study with pyrazine
  RI-CC2 simulations of pyrazine internal conversion match the experimental 22 fs decay time, identify Q9a and Q8a modes as drivers, and show the dark A1u state participates actively.