{"paper":{"title":"Inverse-Hessian Regularization for Continual Learning in ASR","license":"http://creativecommons.org/licenses/by/4.0/","headline":"Inverse-Hessian Regularization adjusts post-fine-tuning ASR updates with prior-task curvature to limit forgetting while preserving adaptability.","cross_cats":[],"primary_cat":"eess.AS","authors_text":"Hugo Van hamme, Steven Vander Eeckt","submitted_at":"2026-01-21T08:10:26Z","abstract_excerpt":"Catastrophic forgetting remains a major challenge for continual learning (CL) in automatic speech recognition (ASR), where models must adapt to new domains without losing performance on previously learned conditions. Several CL methods have been proposed for ASR, and, recently, weight averaging - where models are averaged in a merging step after fine-tuning - has proven effective as a simple memory-free strategy. However, it is heuristic in nature and ignores the underlying loss landscapes of the tasks, hindering adaptability. In this work, we propose Inverse Hessian Regularization (IHR), a me"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"After fine-tuning on a new task, the adaptation is adjusted through a Kronecker-factored inverse Hessian approximation of the previous task, ensuring that the model moves primarily in directions less harmful to past performance, while keeping the method lightweight.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"The Kronecker-factored inverse Hessian approximation sufficiently captures the loss landscape curvature of previous ASR tasks to guide safe updates without harming adaptability.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"IHR incorporates curvature information via inverse Hessian approximation into the model merging step for continual learning in ASR, outperforming baselines by reducing forgetting while improving adaptability on two benchmarks.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"Inverse-Hessian Regularization adjusts post-fine-tuning ASR updates with prior-task curvature to limit forgetting while preserving adaptability.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"9d0f0d783c96410b54c587f82ca17213dbe71dc5365efa6a64e6c6235efcba09"},"source":{"id":"2601.14751","kind":"arxiv","version":1},"verdict":{"id":"eca37624-c08c-42b6-a6d1-784390d28b1f","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-16T12:37:01.098594Z","strongest_claim":"After fine-tuning on a new task, the adaptation is adjusted through a Kronecker-factored inverse Hessian approximation of the previous task, ensuring that the model moves primarily in directions less harmful to past performance, while keeping the method lightweight.","one_line_summary":"IHR incorporates curvature information via inverse Hessian approximation into the model merging step for continual learning in ASR, outperforming baselines by reducing forgetting while improving adaptability on two benchmarks.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"The Kronecker-factored inverse Hessian approximation sufficiently captures the loss landscape curvature of previous ASR tasks to guide safe updates without harming adaptability.","pith_extraction_headline":"Inverse-Hessian Regularization adjusts post-fine-tuning ASR updates with prior-task curvature to limit forgetting while preserving adaptability."},"references":{"count":36,"sample":[{"doi":"","year":null,"title":"To be accurate and inclusive, they must adapt to new speakers, accents, domains, or recording conditions","work_id":"ca3db169-9031-43f6-8e93-7f33192187a6","ref_index":1,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":null,"title":"ASR Model We consider an encoder–decoder ASR model","work_id":"06a0e81b-2219-4914-8aa0-b79f71a05fe5","ref_index":2,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2026,"title":"Inverse-Hessian Regularization for Continual Learning in ASR","work_id":"562a10a8-0262-4ec3-ae33-f770c1b447de","ref_index":3,"cited_arxiv_id":"2601.14751","is_internal_anchor":true},{"doi":"","year":2048,"title":"More information, including code and detailed results, can be found in our Github repository 1","work_id":"d0639223-b176-4752-917f-1d7666cec8dd","ref_index":4,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":null,"title":"Experiment 1 As shown by Table 1, our method (IHR) significantly outperforms all baselines, being able to learn with close to zero forgetting (as shown by its -0.1 BWT)","work_id":"fc420377-c04d-4c50-9321-44fe9d0c63ec","ref_index":5,"cited_arxiv_id":"","is_internal_anchor":false}],"resolved_work":36,"snapshot_sha256":"59c601a6868ad7ca174923dfe1e0b5213f137cb0df9354fca48ba8037744410f","internal_anchors":1},"formal_canon":{"evidence_count":1,"snapshot_sha256":"c4e47bd4eb9c3fe9bdd713734d2f5a1555b23026c848f36cd4513a4abf38fca4"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}