Training-free Task Classification for Multi-Task Model Merging

Jinwook Jung; Jungyong Son; Sungyong Baik

arxiv: 2606.22589 · v1 · pith:67AHLDEPnew · submitted 2026-06-21 · 💻 cs.LG

Training-free Task Classification for Multi-Task Model Merging

Jungyong Son , Jinwook Jung , Sungyong Baik This is my paper

Pith reviewed 2026-06-26 10:45 UTC · model grok-4.3

classification 💻 cs.LG

keywords model mergingtask routingtraining-free classificationSVD low-rank manifoldsprojection residualdynamic mergingmulti-task modelstask-unknown inference

0 comments

The pith

SiM classifies tasks via projection residuals onto SVD low-rank manifolds to route merged models without router training or task IDs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper seeks to merge multiple task-specific experts into one model that performs nearly as well as the individual experts even when the task identity of a test input is unknown. It reframes the routing step as a training-free classification problem solved by measuring how well each input projects onto a low-rank manifold built per task from SVD. A reader would care because earlier dynamic merging methods either needed extra labeled data to train routers or assumed task IDs were supplied at inference. If the approach holds, merged models become usable in settings where task labels are unavailable and no additional training is allowed. The manifolds themselves are built offline from a small support set of examples per task before any merging occurs.

Core claim

Task classification for routing can be performed without any router training by scoring each test input according to its projection residual onto SVD-derived low-rank manifolds computed separately for each task from a small support set; the lowest-residual task determines the routing choice, and the same manifolds integrate directly with subspace- or mask-based merging that stores only compressed task vectors.

What carries the argument

SVD-based low-rank manifold approximations for each task, scored by projection residual of the test input feature.

If this is right

Merged-model accuracy rises under task-unknown inference and narrows the gap to separate task experts.
No router training and no task-ID access are required at inference.
The method works with compressed task-vector representations that avoid storing full expert weights.
The same offline manifold computation applies across both vision and language benchmarks.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same residual-scoring idea could be tested on merging more than a handful of tasks without retraining any component.
If the low-rank structure generalizes, support-set size could be reduced further while preserving separation quality.
The approach suggests that task discrimination may not need learned routers when feature geometry already encodes task identity.

Load-bearing premise

SVD low-rank approximations from a small per-task support set produce manifolds whose projection residuals reliably distinguish tasks at inference time.

What would settle it

On a new collection of tasks, the task whose manifold yields the smallest residual for a given input matches the true task no more often than random selection, or the routed merged model shows no improvement over a non-routed baseline.

Figures

Figures reproduced from arXiv: 2606.22589 by Jinwook Jung, Jungyong Son, Sungyong Baik.

**Figure 2.** Figure 2: Overview of SiM. The figure provides a detailed illustration of the SiM framework, which outlines the internal processes within the gray dashed border box shown in [PITH_FULL_IMAGE:figures/full_fig_p005_2.png] view at source ↗

**Figure 3.** Figure 3: Task-specific feature distributions from the CLIP ViT-B/32 across 8 vision tasks [PITH_FULL_IMAGE:figures/full_fig_p006_3.png] view at source ↗

**Figure 5.** Figure 5: Manifold projection residual vs. Euclidean distance. Two features z1 and z2 have the same Euclidean distance to the task mean µ, but their projection residuals differ depending on their alignment with the task manifold. Cars DTD EuroSAT GTSRB MNIST RESISC45 SUN397 SVHN Task Manifold Cars DTD EuroSAT GTSRB MNIST RESISC45 SUN397 SVHN Input Task 0.50 0.79 0.72 0.74 0.72 0.78 0.80 0.75 0.79 0.57 0.67 0.74 0.… view at source ↗

**Figure 7.** Figure 7: Task classification performance of SiM vs Euclidean on the CLIP ViT-B/32 across 8, 14, and 20 tasks 4 8 16 32 64 128 256 512 N Number of Samples 85.0 87.5 90.0 92.5 95.0 97.5 100.0 Acc Task 8 Task 14 Task 20 [PITH_FULL_IMAGE:figures/full_fig_p014_7.png] view at source ↗

read the original abstract

Ever since the advent of foundation models and the pre-training-finetuning paradigm, there have been numerous efforts to merge multiple task-specific experts into a single multi-task model. Prior work largely focuses on finding a single merged model, but it often underperforms individual experts due to parameter interference. To resolve this, dynamic model merging employs routing to activate task-relevant parameters per input. However, existing routers typically require either additional training with abundant labeled datasets or assume the access to task IDs of each input at inference time. In this work, we aim to close the gap to expert performance without additional training or task-ID-access assumption. To this end, we formulate routing as training-free task classification for each test input. Using singular value decomposition (SVD)-based low-rank manifold approximations for each task, SiM scores tasks by the projection residual of the test input feature onto each task manifold and routes accordingly. The task manifolds are pre-computable offline from a pretrained backbone using a small per-task support set (e.g., 32 examples per task) prior to merging process, requiring no router training and no data during the merging process. Moreover, SiM integrates seamlessly with subspace-/mask-based merging that represents task-expert via lightweight compressed task vectors, avoiding the need to store full expert parameters. Experiments across computer vision and natural language processing benchmarks under task-unknown inference demonstrate that SiM substantially improves merged-model performance and consistently narrows the gap to individual task experts.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

SiM gives a clean training-free router for dynamic merging by scoring projection residuals onto SVD manifolds from small support sets, but the stability of those manifolds is unverified.

read the letter

The main takeaway is that this paper removes the need for either router training or task IDs by turning routing into a simple nearest-manifold lookup. It builds low-rank approximations per task from 32-example support sets via SVD, then routes each input to the task whose manifold gives the smallest residual norm. That formulation is new relative to the dynamic merging papers cited, which either train routers or assume task labels at test time.

The work does a few things cleanly. It stays compatible with existing subspace and mask-based merging techniques that store only compressed task vectors, so no full expert parameters need to be kept around. The abstract also claims consistent gains on both vision and language benchmarks under fully task-unknown inference, narrowing the gap to single-task experts.

The soft spot is exactly the one flagged in the stress test. The method assumes that SVD column spaces estimated from 32 examples will be stable across draws and will separate the task distributions in feature space. The abstract gives no variance numbers over different support-set samples, no scaling curves with support-set size, and no measurement of angles between task subspaces. If those subspaces overlap or shift with sampling, the residual scores become unreliable. Without those checks, the reported improvements could be sensitive to the particular support sets or SVD ranks chosen.

This paper is for groups already working on parameter-efficient multi-task inference who want to drop the router-training step. A reader who needs a practical baseline for training-free routing will find the idea worth testing. It deserves a serious referee because the problem is real and the proposed fix is lightweight enough to evaluate quickly, even though the current evidence leaves the core assumption open.

Referee Report

3 major / 2 minor

Summary. The paper introduces SiM, a training-free task classification method for routing in multi-task model merging. It approximates each task's feature manifold via SVD on a small per-task support set (e.g., 32 examples), then routes each test input by the projection residual onto these manifolds, claiming this narrows the gap to individual task experts on CV and NLP benchmarks under task-unknown inference while integrating with subspace/mask-based merging and requiring no router training or task IDs.

Significance. If the reported gains hold after controlling for support-set sampling and SVD rank, the result would be significant for practical multi-task merging: it removes the usual costs of trained routers and task-ID assumptions while preserving the benefits of dynamic parameter activation. The offline pre-computation of manifolds from pretrained backbones and compatibility with compressed task vectors are practical strengths; the benchmark improvements, if robust, would demonstrate a viable path to closing the expert gap without additional labeled data.

major comments (3)

[§3.2] §3.2 (task manifold construction and routing rule): the central claim that projection residuals onto SVD low-rank approximations from 32-example support sets reliably distinguish tasks is load-bearing, yet the manuscript provides no variance analysis across multiple draws of the support set, no scaling curves versus support-set size, and no quantification of inter-task subspace angles or overlap; without these, it is unclear whether the argmin residual decision is stable or discriminative when tasks share feature directions.
[§4] §4 (experimental protocol): all reported results fix the support-set size at 32 and SVD rank without ablation or sensitivity analysis; this leaves open whether the claimed narrowing of the gap to task experts is an artifact of particular hyperparameter choices rather than a general property of the residual-based classifier.
[Table 2 / Figure 3] Table 2 / Figure 3 (merged-model accuracy under task-unknown inference): the gains are presented without error bars over support-set resampling or backbone feature extractor variation, making it impossible to assess whether the improvements are statistically reliable or sensitive to the weakest assumption that 32-example manifolds separate task distributions.

minor comments (2)

[§3.2] Notation for the projection residual ||(I - U_k U_k^T) f(x)|| is introduced without an explicit equation number; adding one would improve traceability when comparing to related subspace methods.
[§1] The abstract and §1 state that manifolds are “pre-computable offline … requiring no data during the merging process,” but the dependence on a support set (even if small) should be clarified in the introduction to avoid implying zero data requirements.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback highlighting the need for robustness analyses. We will revise the manuscript to incorporate variance studies, ablations, and error bars as detailed below.

read point-by-point responses

Referee: [§3.2] §3.2 (task manifold construction and routing rule): the central claim that projection residuals onto SVD low-rank approximations from 32-example support sets reliably distinguish tasks is load-bearing, yet the manuscript provides no variance analysis across multiple draws of the support set, no scaling curves versus support-set size, and no quantification of inter-task subspace angles or overlap; without these, it is unclear whether the argmin residual decision is stable or discriminative when tasks share feature directions.

Authors: We agree these analyses would strengthen the claims regarding stability. In revision we will add: (i) performance variance over 5 random support-set draws per task, (ii) scaling curves for support-set sizes 8–128, and (iii) average principal angles between task SVD bases to quantify overlap. These will be reported in a new subsection of §3.2. revision: yes
Referee: [§4] §4 (experimental protocol): all reported results fix the support-set size at 32 and SVD rank without ablation or sensitivity analysis; this leaves open whether the claimed narrowing of the gap to task experts is an artifact of particular hyperparameter choices rather than a general property of the residual-based classifier.

Authors: We will include sensitivity ablations varying support-set size (8, 16, 32, 64, 128) and SVD rank (1–16) on the main benchmarks, showing that the gap-narrowing effect holds across reasonable ranges of these hyperparameters. revision: yes
Referee: [Table 2 / Figure 3] Table 2 / Figure 3 (merged-model accuracy under task-unknown inference): the gains are presented without error bars over support-set resampling or backbone feature extractor variation, making it impossible to assess whether the improvements are statistically reliable or sensitive to the weakest assumption that 32-example manifolds separate task distributions.

Authors: We will recompute Table 2 and Figure 3 over multiple support-set resamplings and report mean ± std error bars. A short paragraph will also discuss sensitivity to the backbone feature extractor used for manifold construction. revision: yes

Circularity Check

0 steps flagged

No circularity: explicit SVD-based residual scoring from support sets

full rationale

The paper defines SiM directly as computing per-task low-rank manifolds via SVD on a small support set (e.g., 32 examples) and routing test inputs by explicit projection residuals ||(I - U_k U_k^T) f(x)||. This construction is the method itself; no parameter is fitted on one subset and then invoked as a 'prediction' of a related quantity, no self-citation chain justifies a uniqueness claim or ansatz, and no known result is merely renamed. The derivation chain consists of the stated algorithmic steps with no reduction to inputs by construction. The approach is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axioms · 1 invented entities

The central claim rests on the mathematical property that SVD yields the optimal low-rank approximation and on the modeling choice that small support sets suffice to capture task-specific feature distributions.

free parameters (2)

support set size
Chosen as 32 examples per task in the abstract; functions as a hyperparameter balancing computation and manifold quality.
SVD approximation rank
Not specified in the abstract but required to define the low-rank manifold; acts as a free parameter.

axioms (1)

standard math Singular value decomposition produces the optimal low-rank approximation in the Frobenius norm
Invoked to justify the manifold construction step.

invented entities (1)

task manifold no independent evidence
purpose: Geometric summary of task-specific features used to compute projection residuals for routing
New construct introduced to enable training-free classification; no independent evidence outside the method itself.

pith-pipeline@v0.9.1-grok · 5796 in / 1404 out tokens · 23797 ms · 2026-06-26T10:45:49.483819+00:00 · methodology

discussion (0)

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Learning to Recover Task Experts from a Multi-Task Merged Model
cs.AI 2026-06 unverdicted novelty 6.0

ReTeX predicts additive offsets to undo merging interference and uses an SVD subspace signature task identifier to recover over 95% of expert performance while improving generalization to unseen tasks.

Reference graph

Works this paper leans on

87 extracted references · 5 linked inside Pith · cited by 1 Pith paper

[1]

arXiv preprint arXiv:2303.08774 (2023)

Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: Gpt-4 technical report. arXiv preprint arXiv:2303.08774 (2023)

Pith/arXiv arXiv 2023
[2]

In: ICCCSP (2021)

Ahmed, M.I., Mamun, S.M., Asif, A.U.Z.: Dcnn-based vegetable image classifica- tion using transfer learning: A comparative study. In: ICCCSP (2021)

2021
[3]

Data in Brief (2023)

Ahmed, S.I., Ibrahim, M., Nadim, M., Rahman, M.M., Shejunti, M.M., Jabid, T., Ali, M.S.: Mangoleafbd: A comprehensive image dataset to classify diseased and healthy mango leaves. Data in Brief (2023)

2023
[4]

Alessio, C.: Animals-10.https://www.kaggle.com/datasets/alessiocorrado99/ animals10
[5]

In: NeurIPS (2019)

Ansuini, A., Laio, A., Macke, J.H., Zoccolan, D.: Intrinsic dimension of data rep- resentations in deep neural networks. In: NeurIPS (2019)

2019
[6]

Available on https://www

Bansal, P.: Intel image classification. Available on https://www. kaggle. com/puneet6060/intel-image-classification, Online (2019)

2019
[7]

In: ECCV (2014)

Bossard, L., Guillaumin, M., Gool, L.V.: Food-101 – mining discriminative com- ponents with random forests. In: ECCV (2014)

2014
[8]

In: NeurIPS (2020)

Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Nee- lakantan, A., Shyam, P., Sastry, G., Askell, A., Agarwal, S., Herbert-Voss, A., Krueger, G., Henighan, T., Child, R., Ramesh, A., Ziegler, D.M., Wu, J., Win- ter, C., Hesse, C., Chen, M., Sigler, E., Litwin, M., Gray, S., Chess, B., Clark, J., Berner, C., McCandlish, S., Radford...

2020
[9]

CCHANG: Garbage classification.https://www.kaggle.com/ds/81794(2018)

2018
[10]

Proceedings of the IEEE (2017)

Cheng, G., Han, J., Lu, X.: Remote sensing image scene classification: Benchmark and state of the art. Proceedings of the IEEE (2017)

2017
[11]

In: CVPR (2020)

Choi, Y., Uh, Y., Yoo, J., Ha, J.W.: Stargan v2: Diverse image synthesis for mul- tiple domains. In: CVPR (2020)

2020
[12]

In: CVPR (2014)

Cimpoi, M., Maji, S., Kokkinos, I., Mohamed, S., Vedaldi, A.: Describing textures in the wild. In: CVPR (2014)

2014
[13]

arXiv preprint arXiv:1812.01718 (2018)

Clanuwat, T., Bober-Irizar, M., Kitamoto, A., Lamb, A., Yamamoto, K., Ha, D.: Deep learning for classical japanese literature. arXiv preprint arXiv:1812.01718 (2018)

Pith/arXiv arXiv 2018
[14]

In: AISTATS (2011)

Coates, A., Ng, A., Lee, H.: An analysis of single-layer networks in unsupervised feature learning. In: AISTATS (2011)

2011
[15]

In: IJCNN (2017)

Cohen, G., Afshar, S., Tapson, J., Schaik, A.v.: Emnist: Extending mnist to hand- written letters. In: IJCNN (2017)

2017
[16]

Journal of Diabetes Science and Technology (2009) 16 J

Cuadros, J., Bresnick, G.: Eyepacs: An adaptable telemedicine system for diabetic retinopathy screening. Journal of Diabetes Science and Technology (2009) 16 J. Son et al

2009
[17]

cats.https://kaggle.com/competitions/dogs-vs-cats (2013)

Cukierski, W.: Dogs vs. cats.https://kaggle.com/competitions/dogs-vs-cats (2013)

2013
[18]

DeepNets: Landscape recognition.https : / / www . kaggle . com / datasets / utkarshsaxenadn/landscape-recognition-image-dataset-12k-images
[19]

IEEE Signal Processing Magazine (SPM) (2012)

Deng, L.: The mnist database of handwritten digit images for machine learning research [best of the web]. IEEE Signal Processing Magazine (SPM) (2012)

2012
[20]

In: ICLR (2021)

Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S., Uszkoreit, J., Houlsby, N.: An image is worth 16x16 words: Transformers for image recognition at scale. In: ICLR (2021)

2021
[21]

Computer Vision and Image Understanding (2007)

Fei-Fei, L., Fergus, R., Perona, P.: Learning generative visual models from few training examples: An incremental bayesian approach tested on 101 object cate- gories. Computer Vision and Image Understanding (2007)

2007
[22]

In: NeurIPS (2022)

Feng, R., Zheng, K., Huang, Y., Zhao, D., Jordan, M., Zha, Z.J.: Rank diminishing in deep neural networks. In: NeurIPS (2022)

2022
[23]

arXiv preprint arXiv:2412.00081 (2024)

Gargiulo, A.A., Crisostomi, D., Bucarelli, M.S., Scardapane, S., Silvestri, F., Rodolà, E.: Task singular vectors: Reducing task interference in model merging. arXiv preprint arXiv:2412.00081 (2024)

arXiv 2024
[24]

In: ICONIP (2013)

Goodfellow, I.J., Erhan, D., Carrier, P.L., Courville, A., Mirza, M., Hamner, B., Cukierski, W., Tang, Y., Thaler, D., Lee, D.H., Zhou, Y., Ramaiah, C., Feng, F., Li, R., Wang, X., Athanasakis, D., Shawe-Taylor, J., Milakov, M., Park, J., Ionescu, R., Popescu, M., Grozea, C., Bergstra, J., Xie, J., Romaszko, L., Xu, B., Chuang, Z., Bengio, Y.: Challenges ...

2013
[25]

Griffin, G., Holub, A., Perona, P.: Caltech-256 object category dataset. Tech. rep., California Institute of Technology (2007)

2007
[26]

IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (2019)

Helber, P., Bischke, B., Dengel, A., Borth, D.: Eurosat: A novel dataset and deep learning benchmark for land use and land cover classification. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (2019)

2019
[27]

In: CVPR (2015)

Horn, G.V., Branson, S., Farrell, R., Haber, S., Barry, J., Ipeirotis, P., Perona, P., Belongie, S.: Building a bird recognition app and large scale dataset with citizen scientists: The fine print in fine-grained dataset collection. In: CVPR (2015)

2015
[28]

Howard, J.: Imagenette: A smaller subset of 10 easily classified classes from ima- genet.https://github.com/fastai/imagenette(2019)

2019
[29]

In: NeurIPS (2024)

Huang, C., Ye, P., Chen, T., He, T., Yue, X., Ouyang, W.: Emr-merging: Tuning- free high-performance model merging. In: NeurIPS (2024)

2024
[30]

In: ICLR (2023)

Ilharco, G., Tulio Ribeiro, M., Wortsman, M., Gururangan, S., Schmidt, L., Ha- jishirzi, H., Farhadi, A.: Editing models with task arithmetic. In: ICLR (2023)

2023
[31]

In: ICLR (2025)

Iurada, L., Ciccone, M., Tommasi, T.: Efficient model editing with task-localized sparse fine-tuning. In: ICLR (2025)

2025
[32]

In: ICLR (2023)

Jin, X., Ren, X., Preotiuc-Pietro, D., Cheng, P.: Dataless knowledge fusion by merging weights of language models. In: ICLR (2023)

2023
[33]

Cell (2018)

Kermany, D.S., Goldbaum, M., Cai, W., Valentim, C.C.S., Liang, H., Baxter, S.L., McKeown, A., Yang, G., Wu, X., Yan, F., Dong, J., Prasadha, M., Pei, J., Ting, M., Zhu, J., Li, C., Hewett, S., Dong, J., Ziyar, I., Shi, A., Zhang, R., Gupta, K., Wong, R.M.Y., Lam, L.A.T.C., Cheung, J., Tsoi, M., Wu, V., Yan, C., Huang, C., Lee, D.H., Zhang, Y., Wong, J.W.,...

2018
[34]

In: CVPR (2011) Training-free Task Classification for Multi-Task Model Merging 17

Khosla, A., Jayadevaprakash, N., Yao, B., Fei-Fei, L.: Novel dataset for fine-grained image categorization. In: CVPR (2011) Training-free Task Classification for Multi-Task Model Merging 17

2011
[35]

In: AAAI (2020)

Khot, T., Clark, P., Guerquin, M., Jansen, P., Sabharwal, A.: Qasc: A dataset for question answering via sentence composition. In: AAAI (2020)

2020
[36]

In: ICCV Workshop (2013)

Krause, J., Stark, M., Deng, J., Fei-Fei, L.: 3d object representations for fine- grained categorization. In: ICCV Workshop (2013)

2013
[37]

Krizhevsky, A., Hinton, G., et al.: Learning multiple layers of features from tiny images (2009)

2009
[38]

In: ECCV (2012)

Kumar, N., Belhumeur, P.N., Biswas, A., Jacobs, D.W., Kress, W.J., Lopez, I.C., Soares, J.V.B.: Leafsnap: A computer vision system for automatic plant species identification. In: ECCV (2012)

2012
[39]

Lab, M.A.: Bean disease dataset (2020),https://github.com/AI-Lab-Makerere/ ibean

2020
[40]

IEEE Transactions on Pattern Analysis and Machine Intelligence (2005)

Lee, K.C., Ho, J., Kriegman, D.J.: Acquiring linear subspaces for face recogni- tion under variable lighting. IEEE Transactions on Pattern Analysis and Machine Intelligence (2005)

2005
[41]

In: ICLR (2025)

Lee, Y., Jung, J., Baik, S.: Mitigating parameter interference in model merging via sharpness-aware fine-tuning. In: ICLR (2025)

2025
[42]

In: KR (2012)

Levesque, H., Davis, E., Morgenstern, L.: The winograd schema challenge. In: KR (2012)

2012
[43]

In: CVPR (2017)

Li, S., Deng, W., Du, J.: Reliable crowdsourcing and deep locality-preserving learn- ing for expression recognition in the wild. In: CVPR (2017)

2017
[44]

arXiv preprint arXiv:2206.11404 (2022)

Liao, P., Li, X., Liu, X., Keutzer, K.: The artbench dataset: Benchmarking gener- ative models with artworks. arXiv preprint arXiv:2206.11404 (2022)

arXiv 2022
[45]

Liu, Z., He, K.: A decade’s battle on dataset bias: Are we there yet? In: ICLR (2025)

2025
[46]

In: NeurIPS (2024)

Lu, Z., Fan, C., Wei, W., Qu, X., Chen, D., Cheng, Y.: Twin-merging: Dynamic integration of modular expertise in model merging. In: NeurIPS (2024)

2024
[47]

arXiv preprint arXiv:1306.5151 (2013)

Maji, S., Rahtu, E., Kannala, J., Blaschko, M., Vedaldi, A.: Fine-grained visual classification of aircraft. arXiv preprint arXiv:1306.5151 (2013)

Pith/arXiv arXiv 2013
[48]

In: ICML (2025)

Marczak, D., Magistri, S., Cygert, S., Twardowski, B., Bagdanov, A.D., van de Weijer, J.: No task left behind: Isotropic model merging with common and task- specific subspaces. In: ICML (2025)

2025
[49]

In: NeurIPS (2022)

Matena, M., Raffel, C.: Merging models with fisher-weighted averaging. In: NeurIPS (2022)

2022
[50]

Frontiers in Plant Science (2016)

Mohanty, S.P., Hughes, D.P., Salathé, M.: Using deep learning for image-based plant disease detection. Frontiers in Plant Science (2016)

2016
[51]

Acta Universitatis Sapientiae, Informatica (2018)

Muresan, H., Oltean, M.: Fruit recognition from images using deep learning. Acta Universitatis Sapientiae, Informatica (2018)

2018
[52]

Nagaraj, A.: Asl alphabet.https://www.kaggle.com/datasets/grassknoted/ asl-alphabet
[53]

In: NIPS Workshop (2011)

Netzer, Y., Coates, A., Wu, B., Wang, T., Ng, A.Y.: Reading digits in natural images with unsupervised feature learning. In: NIPS Workshop (2011)

2011
[54]

In: ICVGIP (2008)

Nilsback, M.E., Zisserman, A.: Automated flower classification over a large number of classes. In: ICVGIP (2008)

2008
[55]

In: ICLR (2025)

Oh, C., Li, Y., Song, K., Yun, S., Han, D.: Dawin: Training-free dynamic weight interpolation for robust adaptation. In: ICLR (2025)

2025
[56]

Sci- entific Reports (2019)

Olsen, A., Konovalov, D.A., Philippa, B., Ridd, P., Wood, J.C., Johns, J., Banks, W., Girgenti, B., Kenny, O., Whinney, J., Calvert, B., Rahimi Azghadi, M., White, R.D.: Deepweeds: A multiclass weed species image dataset for deep learning. Sci- entific Reports (2019)

2019
[57]

In: NeurIPS (2023) 18 J

Ortiz-Jimenez, G., Favero, A., Frossard, P.: Task arithmetic in the tangent space: Improved editing of pre-trained models. In: NeurIPS (2023) 18 J. Son et al

2023
[58]

In: CVPR (2012)

Parkhi, O.M., Vedaldi, A., Zisserman, A., Jawahar, C.V.: Cats and dogs. In: CVPR (2012)

2012
[59]

In: ACM MMSys (2017)

Pogorelov, K., Randel, K.R., Griwodz, C., Eskeland, S.L., de Lange, T., Johansen, D., Spampinato, C., Dang-Nguyen, D.T., Lux, M., Schmidt, P.T., et al.: Kvasir: A multi-class image dataset for computer aided gastrointestinal disease detection. In: ACM MMSys (2017)

2017
[60]

In: CVPR (2009)

Quattoni, A., Torralba, A.: Recognizing indoor scenes. In: CVPR (2009)

2009
[61]

In: ICML (2021)

Radford, A., Kim, J.W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., Clark, J., Krueger, G., Sutskever, I.: Learning transferable visual models from natural language supervision. In: ICML (2021)

2021
[62]

Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019)

2019
[63]

In: JMLR (2020)

Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.: Exploring the limits of transfer learning with a unified text-to-text transformer. In: JMLR (2020)

2020
[64]

In: AAAI (2020)

Sakaguchi, K., Bras, R.L., Bhagavatula, C., Choi, Y.: Winogrande: An adversarial winograd schema challenge at scale. In: AAAI (2020)

2020
[65]

In: ACL (2018)

Sharma, R., Allen, J., Bakhshandeh, O., Mostafazadeh, N.: Tackling the story ending biases in the story cloze test. In: ACL (2018)

2018
[66]

arXiv preprint arXiv:2506.16506 (2025)

Skorobogat, R., Roth, K., Georgescu, M.I.: Subspace-boosted model merging. arXiv preprint arXiv:2506.16506 (2025)

arXiv 2025
[67]

In: EMNLP (2013)

Socher, R., Perelygin, A., Wu, J., Chuang, J., Manning, C.D., Ng, A., Potts, C.: Recursive deep models for semantic compositionality over a sentiment treebank. In: EMNLP (2013)

2013
[68]

In: IJCNN (2011)

Stallkamp, J., Schlipsing, M., Salmen, J., Igel, C.: The german traffic sign recog- nition benchmark: a multi-class classification competition. In: IJCNN (2011)

2011
[69]

In: EMNLP (2019)

Tafjord, O., Gardner, M., Lin, K., Clark, P.: Quartz: An open-domain dataset of qualitative relationship questions. In: EMNLP (2019)

2019
[70]

In: ICML (2024)

Tang, A., Shen, L., Luo, Y., Yin, N., Zhang, L., Tao, D.: Merging multi-task models via weight-ensembling mixture of experts. In: ICML (2024)

2024
[71]

In: CVPR (2011)

Torralba, A., Efros, A.A.: Unbiased look at dataset bias. In: CVPR (2011)

2011
[72]

arXiv preprint arXiv:2302.13971 (2023)

Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., Rodriguez, A., Joulin, A., Grave, E., Lample, G.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023)

Pith/arXiv arXiv 2023
[73]

In: MICCAI (2018)

Veeling, B.S., Linmans, J., Winkens, J., Cohen, T., Welling, M.: Rotation equiv- ariant cnns for digital pathology. In: MICCAI (2018)

2018
[74]

Wah, C., Branson, S., Welinder, P., Perona, P., Belongie, S.: The caltech-ucsd birds-200-2011 dataset. Tech. rep., California Institute of Technology (2011)

2011
[75]

In: ICML (2024)

Wang, K., Dimitriadis, N., Ortiz-Jimenez, G., Fleuret, F., Frossard, P.: Localizing task information for improved model merging and compression. In: ICML (2024)

2024
[76]

In: ICLR (2022)

Wei, J., Bosma, M., Zhao, V.Y., Guu, K., Yu, A.W., Lester, B., Du, N., Dai, A.M., Le, Q.V.: Finetuned language models are zero-shot learners. In: ICLR (2022)

2022
[77]

In: ICML (2025)

Wei, Y., Tang, A., Shen, L., Hu, Z., Yuan, C., Cao, X.: Modeling multi-task model merging as adaptive projective gradient descent. In: ICML (2025)

2025
[78]

In: CVPR (2022)

Wortsman, M., Ilharco, G., Kim, J.W., Li, M., Kornblith, S., Roelofs, R., Gontijo- Lopes, R., Hajishirzi, H., Farhadi, A., Namkoong, H., Schmidt, L.: Robust fine- tuning of zero-shot models. In: CVPR (2022)

2022
[79]

IEEE Transactions on Geoscience and Remote Sensing (2017) Training-free Task Classification for Multi-Task Model Merging 19

Xia, G.S., Hu, J., Hu, F., Shi, B., Bai, X., Zhong, Y., Zhang, L., Lu, X.: Aid: A benchmark data set for performance evaluation of aerial scene classification. IEEE Transactions on Geoscience and Remote Sensing (2017) Training-free Task Classification for Multi-Task Model Merging 19

2017
[80]

Xiao, H., Zhang, F., Shen, Z., Wu, K., Zhang, J.: Classification of weather phe- nomenonfromimagesbyusingdeepconvolutionalneuralnetwork.EarthandSpace Science (2021)

2021

Showing first 80 references.

[1] [1]

arXiv preprint arXiv:2303.08774 (2023)

Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: Gpt-4 technical report. arXiv preprint arXiv:2303.08774 (2023)

Pith/arXiv arXiv 2023

[2] [2]

In: ICCCSP (2021)

Ahmed, M.I., Mamun, S.M., Asif, A.U.Z.: Dcnn-based vegetable image classifica- tion using transfer learning: A comparative study. In: ICCCSP (2021)

2021

[3] [3]

Data in Brief (2023)

Ahmed, S.I., Ibrahim, M., Nadim, M., Rahman, M.M., Shejunti, M.M., Jabid, T., Ali, M.S.: Mangoleafbd: A comprehensive image dataset to classify diseased and healthy mango leaves. Data in Brief (2023)

2023

[4] [4]

Alessio, C.: Animals-10.https://www.kaggle.com/datasets/alessiocorrado99/ animals10

[5] [5]

In: NeurIPS (2019)

Ansuini, A., Laio, A., Macke, J.H., Zoccolan, D.: Intrinsic dimension of data rep- resentations in deep neural networks. In: NeurIPS (2019)

2019

[6] [6]

Available on https://www

Bansal, P.: Intel image classification. Available on https://www. kaggle. com/puneet6060/intel-image-classification, Online (2019)

2019

[7] [7]

In: ECCV (2014)

Bossard, L., Guillaumin, M., Gool, L.V.: Food-101 – mining discriminative com- ponents with random forests. In: ECCV (2014)

2014

[8] [8]

In: NeurIPS (2020)

Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Nee- lakantan, A., Shyam, P., Sastry, G., Askell, A., Agarwal, S., Herbert-Voss, A., Krueger, G., Henighan, T., Child, R., Ramesh, A., Ziegler, D.M., Wu, J., Win- ter, C., Hesse, C., Chen, M., Sigler, E., Litwin, M., Gray, S., Chess, B., Clark, J., Berner, C., McCandlish, S., Radford...

2020

[9] [9]

CCHANG: Garbage classification.https://www.kaggle.com/ds/81794(2018)

2018

[10] [10]

Proceedings of the IEEE (2017)

Cheng, G., Han, J., Lu, X.: Remote sensing image scene classification: Benchmark and state of the art. Proceedings of the IEEE (2017)

2017

[11] [11]

In: CVPR (2020)

Choi, Y., Uh, Y., Yoo, J., Ha, J.W.: Stargan v2: Diverse image synthesis for mul- tiple domains. In: CVPR (2020)

2020

[12] [12]

In: CVPR (2014)

Cimpoi, M., Maji, S., Kokkinos, I., Mohamed, S., Vedaldi, A.: Describing textures in the wild. In: CVPR (2014)

2014

[13] [13]

arXiv preprint arXiv:1812.01718 (2018)

Clanuwat, T., Bober-Irizar, M., Kitamoto, A., Lamb, A., Yamamoto, K., Ha, D.: Deep learning for classical japanese literature. arXiv preprint arXiv:1812.01718 (2018)

Pith/arXiv arXiv 2018

[14] [14]

In: AISTATS (2011)

Coates, A., Ng, A., Lee, H.: An analysis of single-layer networks in unsupervised feature learning. In: AISTATS (2011)

2011

[15] [15]

In: IJCNN (2017)

Cohen, G., Afshar, S., Tapson, J., Schaik, A.v.: Emnist: Extending mnist to hand- written letters. In: IJCNN (2017)

2017

[16] [16]

Journal of Diabetes Science and Technology (2009) 16 J

Cuadros, J., Bresnick, G.: Eyepacs: An adaptable telemedicine system for diabetic retinopathy screening. Journal of Diabetes Science and Technology (2009) 16 J. Son et al

2009

[17] [17]

cats.https://kaggle.com/competitions/dogs-vs-cats (2013)

Cukierski, W.: Dogs vs. cats.https://kaggle.com/competitions/dogs-vs-cats (2013)

2013

[18] [18]

DeepNets: Landscape recognition.https : / / www . kaggle . com / datasets / utkarshsaxenadn/landscape-recognition-image-dataset-12k-images

[19] [19]

IEEE Signal Processing Magazine (SPM) (2012)

Deng, L.: The mnist database of handwritten digit images for machine learning research [best of the web]. IEEE Signal Processing Magazine (SPM) (2012)

2012

[20] [20]

In: ICLR (2021)

Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S., Uszkoreit, J., Houlsby, N.: An image is worth 16x16 words: Transformers for image recognition at scale. In: ICLR (2021)

2021

[21] [21]

Computer Vision and Image Understanding (2007)

Fei-Fei, L., Fergus, R., Perona, P.: Learning generative visual models from few training examples: An incremental bayesian approach tested on 101 object cate- gories. Computer Vision and Image Understanding (2007)

2007

[22] [22]

In: NeurIPS (2022)

Feng, R., Zheng, K., Huang, Y., Zhao, D., Jordan, M., Zha, Z.J.: Rank diminishing in deep neural networks. In: NeurIPS (2022)

2022

[23] [23]

arXiv preprint arXiv:2412.00081 (2024)

Gargiulo, A.A., Crisostomi, D., Bucarelli, M.S., Scardapane, S., Silvestri, F., Rodolà, E.: Task singular vectors: Reducing task interference in model merging. arXiv preprint arXiv:2412.00081 (2024)

arXiv 2024

[24] [24]

In: ICONIP (2013)

Goodfellow, I.J., Erhan, D., Carrier, P.L., Courville, A., Mirza, M., Hamner, B., Cukierski, W., Tang, Y., Thaler, D., Lee, D.H., Zhou, Y., Ramaiah, C., Feng, F., Li, R., Wang, X., Athanasakis, D., Shawe-Taylor, J., Milakov, M., Park, J., Ionescu, R., Popescu, M., Grozea, C., Bergstra, J., Xie, J., Romaszko, L., Xu, B., Chuang, Z., Bengio, Y.: Challenges ...

2013

[25] [25]

Griffin, G., Holub, A., Perona, P.: Caltech-256 object category dataset. Tech. rep., California Institute of Technology (2007)

2007

[26] [26]

IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (2019)

Helber, P., Bischke, B., Dengel, A., Borth, D.: Eurosat: A novel dataset and deep learning benchmark for land use and land cover classification. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (2019)

2019

[27] [27]

In: CVPR (2015)

Horn, G.V., Branson, S., Farrell, R., Haber, S., Barry, J., Ipeirotis, P., Perona, P., Belongie, S.: Building a bird recognition app and large scale dataset with citizen scientists: The fine print in fine-grained dataset collection. In: CVPR (2015)

2015

[28] [28]

Howard, J.: Imagenette: A smaller subset of 10 easily classified classes from ima- genet.https://github.com/fastai/imagenette(2019)

2019

[29] [29]

In: NeurIPS (2024)

Huang, C., Ye, P., Chen, T., He, T., Yue, X., Ouyang, W.: Emr-merging: Tuning- free high-performance model merging. In: NeurIPS (2024)

2024

[30] [30]

In: ICLR (2023)

Ilharco, G., Tulio Ribeiro, M., Wortsman, M., Gururangan, S., Schmidt, L., Ha- jishirzi, H., Farhadi, A.: Editing models with task arithmetic. In: ICLR (2023)

2023

[31] [31]

In: ICLR (2025)

Iurada, L., Ciccone, M., Tommasi, T.: Efficient model editing with task-localized sparse fine-tuning. In: ICLR (2025)

2025

[32] [32]

In: ICLR (2023)

Jin, X., Ren, X., Preotiuc-Pietro, D., Cheng, P.: Dataless knowledge fusion by merging weights of language models. In: ICLR (2023)

2023

[33] [33]

Cell (2018)

Kermany, D.S., Goldbaum, M., Cai, W., Valentim, C.C.S., Liang, H., Baxter, S.L., McKeown, A., Yang, G., Wu, X., Yan, F., Dong, J., Prasadha, M., Pei, J., Ting, M., Zhu, J., Li, C., Hewett, S., Dong, J., Ziyar, I., Shi, A., Zhang, R., Gupta, K., Wong, R.M.Y., Lam, L.A.T.C., Cheung, J., Tsoi, M., Wu, V., Yan, C., Huang, C., Lee, D.H., Zhang, Y., Wong, J.W.,...

2018

[34] [34]

In: CVPR (2011) Training-free Task Classification for Multi-Task Model Merging 17

Khosla, A., Jayadevaprakash, N., Yao, B., Fei-Fei, L.: Novel dataset for fine-grained image categorization. In: CVPR (2011) Training-free Task Classification for Multi-Task Model Merging 17

2011

[35] [35]

In: AAAI (2020)

Khot, T., Clark, P., Guerquin, M., Jansen, P., Sabharwal, A.: Qasc: A dataset for question answering via sentence composition. In: AAAI (2020)

2020

[36] [36]

In: ICCV Workshop (2013)

Krause, J., Stark, M., Deng, J., Fei-Fei, L.: 3d object representations for fine- grained categorization. In: ICCV Workshop (2013)

2013

[37] [37]

Krizhevsky, A., Hinton, G., et al.: Learning multiple layers of features from tiny images (2009)

2009

[38] [38]

In: ECCV (2012)

Kumar, N., Belhumeur, P.N., Biswas, A., Jacobs, D.W., Kress, W.J., Lopez, I.C., Soares, J.V.B.: Leafsnap: A computer vision system for automatic plant species identification. In: ECCV (2012)

2012

[39] [39]

Lab, M.A.: Bean disease dataset (2020),https://github.com/AI-Lab-Makerere/ ibean

2020

[40] [40]

IEEE Transactions on Pattern Analysis and Machine Intelligence (2005)

Lee, K.C., Ho, J., Kriegman, D.J.: Acquiring linear subspaces for face recogni- tion under variable lighting. IEEE Transactions on Pattern Analysis and Machine Intelligence (2005)

2005

[41] [41]

In: ICLR (2025)

Lee, Y., Jung, J., Baik, S.: Mitigating parameter interference in model merging via sharpness-aware fine-tuning. In: ICLR (2025)

2025

[42] [42]

In: KR (2012)

Levesque, H., Davis, E., Morgenstern, L.: The winograd schema challenge. In: KR (2012)

2012

[43] [43]

In: CVPR (2017)

Li, S., Deng, W., Du, J.: Reliable crowdsourcing and deep locality-preserving learn- ing for expression recognition in the wild. In: CVPR (2017)

2017

[44] [44]

arXiv preprint arXiv:2206.11404 (2022)

Liao, P., Li, X., Liu, X., Keutzer, K.: The artbench dataset: Benchmarking gener- ative models with artworks. arXiv preprint arXiv:2206.11404 (2022)

arXiv 2022

[45] [45]

Liu, Z., He, K.: A decade’s battle on dataset bias: Are we there yet? In: ICLR (2025)

2025

[46] [46]

In: NeurIPS (2024)

Lu, Z., Fan, C., Wei, W., Qu, X., Chen, D., Cheng, Y.: Twin-merging: Dynamic integration of modular expertise in model merging. In: NeurIPS (2024)

2024

[47] [47]

arXiv preprint arXiv:1306.5151 (2013)

Maji, S., Rahtu, E., Kannala, J., Blaschko, M., Vedaldi, A.: Fine-grained visual classification of aircraft. arXiv preprint arXiv:1306.5151 (2013)

Pith/arXiv arXiv 2013

[48] [48]

In: ICML (2025)

Marczak, D., Magistri, S., Cygert, S., Twardowski, B., Bagdanov, A.D., van de Weijer, J.: No task left behind: Isotropic model merging with common and task- specific subspaces. In: ICML (2025)

2025

[49] [49]

In: NeurIPS (2022)

Matena, M., Raffel, C.: Merging models with fisher-weighted averaging. In: NeurIPS (2022)

2022

[50] [50]

Frontiers in Plant Science (2016)

Mohanty, S.P., Hughes, D.P., Salathé, M.: Using deep learning for image-based plant disease detection. Frontiers in Plant Science (2016)

2016

[51] [51]

Acta Universitatis Sapientiae, Informatica (2018)

Muresan, H., Oltean, M.: Fruit recognition from images using deep learning. Acta Universitatis Sapientiae, Informatica (2018)

2018

[52] [52]

Nagaraj, A.: Asl alphabet.https://www.kaggle.com/datasets/grassknoted/ asl-alphabet

[53] [53]

In: NIPS Workshop (2011)

Netzer, Y., Coates, A., Wu, B., Wang, T., Ng, A.Y.: Reading digits in natural images with unsupervised feature learning. In: NIPS Workshop (2011)

2011

[54] [54]

In: ICVGIP (2008)

Nilsback, M.E., Zisserman, A.: Automated flower classification over a large number of classes. In: ICVGIP (2008)

2008

[55] [55]

In: ICLR (2025)

Oh, C., Li, Y., Song, K., Yun, S., Han, D.: Dawin: Training-free dynamic weight interpolation for robust adaptation. In: ICLR (2025)

2025

[56] [56]

Sci- entific Reports (2019)

Olsen, A., Konovalov, D.A., Philippa, B., Ridd, P., Wood, J.C., Johns, J., Banks, W., Girgenti, B., Kenny, O., Whinney, J., Calvert, B., Rahimi Azghadi, M., White, R.D.: Deepweeds: A multiclass weed species image dataset for deep learning. Sci- entific Reports (2019)

2019

[57] [57]

In: NeurIPS (2023) 18 J

Ortiz-Jimenez, G., Favero, A., Frossard, P.: Task arithmetic in the tangent space: Improved editing of pre-trained models. In: NeurIPS (2023) 18 J. Son et al

2023

[58] [58]

In: CVPR (2012)

Parkhi, O.M., Vedaldi, A., Zisserman, A., Jawahar, C.V.: Cats and dogs. In: CVPR (2012)

2012

[59] [59]

In: ACM MMSys (2017)

Pogorelov, K., Randel, K.R., Griwodz, C., Eskeland, S.L., de Lange, T., Johansen, D., Spampinato, C., Dang-Nguyen, D.T., Lux, M., Schmidt, P.T., et al.: Kvasir: A multi-class image dataset for computer aided gastrointestinal disease detection. In: ACM MMSys (2017)

2017

[60] [60]

In: CVPR (2009)

Quattoni, A., Torralba, A.: Recognizing indoor scenes. In: CVPR (2009)

2009

[61] [61]

In: ICML (2021)

Radford, A., Kim, J.W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., Clark, J., Krueger, G., Sutskever, I.: Learning transferable visual models from natural language supervision. In: ICML (2021)

2021

[62] [62]

Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019)

2019

[63] [63]

In: JMLR (2020)

Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.: Exploring the limits of transfer learning with a unified text-to-text transformer. In: JMLR (2020)

2020

[64] [64]

In: AAAI (2020)

Sakaguchi, K., Bras, R.L., Bhagavatula, C., Choi, Y.: Winogrande: An adversarial winograd schema challenge at scale. In: AAAI (2020)

2020

[65] [65]

In: ACL (2018)

Sharma, R., Allen, J., Bakhshandeh, O., Mostafazadeh, N.: Tackling the story ending biases in the story cloze test. In: ACL (2018)

2018

[66] [66]

arXiv preprint arXiv:2506.16506 (2025)

Skorobogat, R., Roth, K., Georgescu, M.I.: Subspace-boosted model merging. arXiv preprint arXiv:2506.16506 (2025)

arXiv 2025

[67] [67]

In: EMNLP (2013)

Socher, R., Perelygin, A., Wu, J., Chuang, J., Manning, C.D., Ng, A., Potts, C.: Recursive deep models for semantic compositionality over a sentiment treebank. In: EMNLP (2013)

2013

[68] [68]

In: IJCNN (2011)

Stallkamp, J., Schlipsing, M., Salmen, J., Igel, C.: The german traffic sign recog- nition benchmark: a multi-class classification competition. In: IJCNN (2011)

2011

[69] [69]

In: EMNLP (2019)

Tafjord, O., Gardner, M., Lin, K., Clark, P.: Quartz: An open-domain dataset of qualitative relationship questions. In: EMNLP (2019)

2019

[70] [70]

In: ICML (2024)

Tang, A., Shen, L., Luo, Y., Yin, N., Zhang, L., Tao, D.: Merging multi-task models via weight-ensembling mixture of experts. In: ICML (2024)

2024

[71] [71]

In: CVPR (2011)

Torralba, A., Efros, A.A.: Unbiased look at dataset bias. In: CVPR (2011)

2011

[72] [72]

arXiv preprint arXiv:2302.13971 (2023)

Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., Rodriguez, A., Joulin, A., Grave, E., Lample, G.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023)

Pith/arXiv arXiv 2023

[73] [73]

In: MICCAI (2018)

Veeling, B.S., Linmans, J., Winkens, J., Cohen, T., Welling, M.: Rotation equiv- ariant cnns for digital pathology. In: MICCAI (2018)

2018

[74] [74]

Wah, C., Branson, S., Welinder, P., Perona, P., Belongie, S.: The caltech-ucsd birds-200-2011 dataset. Tech. rep., California Institute of Technology (2011)

2011

[75] [75]

In: ICML (2024)

Wang, K., Dimitriadis, N., Ortiz-Jimenez, G., Fleuret, F., Frossard, P.: Localizing task information for improved model merging and compression. In: ICML (2024)

2024

[76] [76]

In: ICLR (2022)

Wei, J., Bosma, M., Zhao, V.Y., Guu, K., Yu, A.W., Lester, B., Du, N., Dai, A.M., Le, Q.V.: Finetuned language models are zero-shot learners. In: ICLR (2022)

2022

[77] [77]

In: ICML (2025)

Wei, Y., Tang, A., Shen, L., Hu, Z., Yuan, C., Cao, X.: Modeling multi-task model merging as adaptive projective gradient descent. In: ICML (2025)

2025

[78] [78]

In: CVPR (2022)

Wortsman, M., Ilharco, G., Kim, J.W., Li, M., Kornblith, S., Roelofs, R., Gontijo- Lopes, R., Hajishirzi, H., Farhadi, A., Namkoong, H., Schmidt, L.: Robust fine- tuning of zero-shot models. In: CVPR (2022)

2022

[79] [79]

IEEE Transactions on Geoscience and Remote Sensing (2017) Training-free Task Classification for Multi-Task Model Merging 19

Xia, G.S., Hu, J., Hu, F., Shi, B., Bai, X., Zhong, Y., Zhang, L., Lu, X.: Aid: A benchmark data set for performance evaluation of aerial scene classification. IEEE Transactions on Geoscience and Remote Sensing (2017) Training-free Task Classification for Multi-Task Model Merging 19

2017

[80] [80]

Xiao, H., Zhang, F., Shen, Z., Wu, K., Zhang, J.: Classification of weather phe- nomenonfromimagesbyusingdeepconvolutionalneuralnetwork.EarthandSpace Science (2021)

2021