pith. sign in

arxiv: 2506.22726 · v3 · pith:S372HV3Znew · submitted 2025-06-28 · 💻 cs.CV · cs.LG

XTransfer: Modality-Agnostic Few-Shot Model Transfer for Human Sensing at the Edge

Pith reviewed 2026-05-19 07:52 UTC · model grok-4.3

classification 💻 cs.CV cs.LG
keywords few-shot transfermodality-agnosticmodel transferhuman sensingedge computingsensor modalitiesdeep learning adaptation
0
0 comments X

The pith

XTransfer allows pre-trained human sensing models to transfer across different sensor modalities using only a few examples.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Human sensing applications on edge devices require deep learning models that can run efficiently but collecting enough labeled data for each new sensor type or task is expensive. The paper introduces XTransfer to overcome this by taking models pre-trained on one type of sensor data and adapting them to a different type with minimal new data. It repairs the layers to account for the differences in how sensors capture information and then recombines selected layers from various source models to build a suitable new model. This results in performance that matches or beats existing methods while using much less data, training effort, and resources for deployment on edge hardware. If true, it would make it practical to deploy smart sensing in many more real-world scenarios where data is scarce.

Core claim

XTransfer is a modality-agnostic few-shot model transfer method for human sensing that flexibly uses pre-trained models and transfers knowledge across modalities by model repairing that adapts pre-trained layers with few sensor data to mitigate modality shift and layer recombining that searches and recombines layers from source models layer-wise to restructure models, achieving state-of-the-art performance while reducing costs of sensor data collection, model training, and edge deployment.

What carries the argument

model repairing to safely mitigate modality shift by adapting pre-trained layers with few sensor data combined with layer recombining to efficiently search and recombine layers of interest from source models in a layer-wise manner to restructure models for new modalities

If this is right

  • XTransfer achieves state-of-the-art performance across diverse human sensing datasets spanning different modalities.
  • It significantly reduces the costs associated with sensor data collection for new applications.
  • Model training becomes more efficient through the use of repaired and recombined layers rather than full retraining.
  • Edge deployment is facilitated by the resource-efficient design of the transferred models.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the approach holds, it could enable quick adaptation of sensing systems to new sensor types in the field without gathering large datasets.
  • Similar repair and recombine strategies might apply to other transfer learning problems where input domains differ substantially, such as adapting vision models to audio tasks.
  • Maintaining a shared pool of pre-trained layers from various modalities could become a standard practice for efficient edge AI development.

Load-bearing premise

That pre-trained layers from one sensing modality can be safely repaired and recombined with layers from other modalities using only few-shot target data without introducing unrecoverable performance degradation from modality shift.

What would settle it

Demonstrating a case where applying model repairing and layer recombining to a new modality pair results in lower accuracy than training a small model from scratch on the same few-shot data or observing severe degradation that cannot be recovered.

Figures

Figures reproduced from arXiv: 2506.22726 by Hong Jia, Hualin Zhou, Jianfei Yang, Shang Gao, Tao Gu, Xinyuan Chen, Xi Zhang, Yuankai Qi, Yu Zhang.

Figure 1
Figure 1. Figure 1: Preliminary study. (a) reveal baseline performance gap 1 [51]. (b) shows the average similarity and FSL difficulty across all sensing datasets (Tab. 2) to each source modality (e.g., Image, Text, Sensing) (c.f. Sec. 3). 2 distinct areas represent similarity levels (A–hard, B–normal). Key findings: 1) compared to CUB, similarity levels across modalities are notably low, e.g., Text and Sensing fall into Area… view at source ↗
Figure 2
Figure 2. Figure 2: Design insights. (a) Layer-wise accuracy convergence using baselines is disrupted due to modality shift. (b) A notable MMC shift emerges and grows with increasing layer index, i.e., accuracy increases while MMC shift drops in Area A and begins to drop as MMC shift largely grows in Area B. (c) After repairing, layer S-score improves, but stagnation occurs at certain layers. layer-wise misalignment and accur… view at source ↗
Figure 3
Figure 3. Figure 3: XTransfer overview. LWS control segments source models into layers and uses the pre-search check to decide if repairing is needed. SRR pipeline then fine-tunes connectors to repair selected layers. Finally, LWS control selects and recombines layers of interest into a compact model. The optimized component weights (i.e., projection coefficients) highlight the most important channels that contribute to the p… view at source ↗
Figure 4
Figure 4. Figure 4: Ablation study evaluating the performance of model repairing and layer recombining. [PITH_FULL_IMAGE:figures/full_fig_p008_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: (a) Embedded mmWave radar testbed setup; (b)-(e) Built human sensing applications across different [PITH_FULL_IMAGE:figures/full_fig_p016_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Design insights. (a) shows layer-wise metric correlation. (b)(c) present efficient search insights into LWS control using multiple source models. B Technical details B.1 Default reshaping To align with source model input shape, we develop a default reshaping to transform sensor data shape. It uses bilinear interpolation [84] (i.e., Resizer) to resize the height and width, and a fixed convolutional layer (i… view at source ↗
Figure 7
Figure 7. Figure 7: Ablation study evaluating the performance of components, search parameters, and applications. [PITH_FULL_IMAGE:figures/full_fig_p018_7.png] view at source ↗
read the original abstract

Deep learning for human sensing on edge systems presents significant potential for smart applications. However, its training and development are hindered by the limited availability of sensor data and resource constraints of edge systems. While transferring pre-trained models to different sensing applications is promising, existing methods often require extensive sensor data and computational resources, resulting in high costs and limited transferability. In this paper, we propose XTransfer, a first-of-its-kind method enabling modality-agnostic, few-shot model transfer with resource-efficient design. XTransfer flexibly uses pre-trained models and transfers knowledge across different modalities by (i) model repairing that safely mitigates modality shift by adapting pre-trained layers with only few sensor data, and (ii) layer recombining that efficiently searches and recombines layers of interest from source models in a layer-wise manner to restructure models. We benchmark various baselines across diverse human sensing datasets spanning different modalities. The results show that XTransfer achieves state-of-the-art performance while significantly reducing the costs of sensor data collection, model training, and edge deployment.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces XTransfer, a modality-agnostic few-shot model transfer technique for human sensing on edge devices. It consists of (i) model repairing, which adapts pre-trained layers using limited target-domain sensor data to address modality shift, and (ii) layer recombining, which performs a layer-wise search to select and reassemble components from one or more source models. The authors benchmark the method against baselines on multiple human sensing datasets spanning vision, IMU, and audio modalities, claiming state-of-the-art accuracy together with substantial reductions in data collection, training, and deployment costs.

Significance. If the empirical claims are substantiated, the work would be significant for resource-constrained edge sensing applications. Enabling cross-modal transfer with only a few dozen labeled samples per target modality could materially lower the barrier to deploying deep models in domains where labeled data are expensive to acquire. The explicit focus on edge deployment cost is a practical strength not always emphasized in transfer-learning papers.

major comments (2)
  1. [§4.3, Table 3] §4.3 and Table 3: the central claim that model repairing 'safely mitigates modality shift' without unrecoverable degradation rests on the reported accuracy numbers, yet no ablation isolates the repair step from the subsequent recombining step, nor is a quantitative bound given on tolerable modality shift. Without these controls it is impossible to verify that the few-shot adaptation itself is responsible for the observed gains rather than the layer search.
  2. [§5.1] §5.1: the SOTA comparisons are presented as single-point estimates without error bars, standard deviations across random seeds, or statistical significance tests. Given that few-shot regimes are known to exhibit high variance, the reported margins over strong baselines cannot yet be treated as reliable.
minor comments (2)
  1. [§3.2] The description of the layer-recombining search objective in §3.2 would benefit from an explicit pseudocode listing or complexity analysis to clarify the computational cost of the search.
  2. [Figure 4] Figure 4 caption and axis labels should explicitly state the number of shots used in each few-shot setting so that readers can directly compare data-efficiency claims.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful and constructive comments on our manuscript. We address each of the major comments in detail below and outline the revisions we plan to make to strengthen the paper.

read point-by-point responses
  1. Referee: [§4.3, Table 3] §4.3 and Table 3: the central claim that model repairing 'safely mitigates modality shift' without unrecoverable degradation rests on the reported accuracy numbers, yet no ablation isolates the repair step from the subsequent recombining step, nor is a quantitative bound given on tolerable modality shift. Without these controls it is impossible to verify that the few-shot adaptation itself is responsible for the observed gains rather than the layer search.

    Authors: We appreciate the referee pointing out the need for clearer isolation of the model repairing component. Although the overall results support the effectiveness of the combined approach, we acknowledge that an explicit ablation would better substantiate the claim that repairing safely mitigates modality shift independently. In the revised manuscript, we will add a new ablation experiment that applies layer recombining both with and without the model repairing step on the same target data. This will allow direct comparison of the contribution of repairing. For the quantitative bound on tolerable modality shift, we will compute and report metrics such as the Wasserstein distance or KL divergence between source and target feature distributions for each modality pair, and correlate these with the observed performance to provide empirical guidance on the limits of the method. revision: yes

  2. Referee: [§5.1] §5.1: the SOTA comparisons are presented as single-point estimates without error bars, standard deviations across random seeds, or statistical significance tests. Given that few-shot regimes are known to exhibit high variance, the reported margins over strong baselines cannot yet be treated as reliable.

    Authors: We fully agree that single-point estimates are insufficient given the known variability in few-shot learning. To address this, we will conduct additional experiments by repeating the training and evaluation process over multiple random seeds and data splits. The revised results will include mean performance metrics with standard deviations. Furthermore, we will perform statistical significance testing (e.g., using the Wilcoxon signed-rank test or t-tests with Bonferroni correction) between XTransfer and the competing methods to validate that the improvements are statistically significant rather than due to chance. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical method with external benchmarks

full rationale

The paper presents XTransfer as an algorithmic proposal consisting of model repairing (adapting pre-trained layers with few-shot data) and layer recombining (searching and recombining layers across source models). These steps are described procedurally without equations that define performance metrics in terms of the method's own fitted outputs. Results are obtained by benchmarking against baselines on diverse external human-sensing datasets spanning modalities; no load-bearing claim reduces to a self-fit, self-citation chain, or renaming of inputs. The derivation chain is the method definition itself, which remains independent of the reported SOTA numbers. This matches the default expectation for an empirical transfer-learning paper whose central claims are falsifiable against held-out data.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No explicit free parameters, axioms, or invented entities are identifiable from the abstract; the method relies on standard transfer learning assumptions about pre-trained models and modality shift that are not detailed here.

pith-pipeline@v0.9.0 · 5738 in / 1073 out tokens · 24143 ms · 2026-05-19T07:52:53.092390+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

88 extracted references · 88 canonical work pages · 1 internal anchor

  1. [1]

    A survey of mmwave-based human sensing: Technology, platforms and applications

    Jia Zhang, Rui Xi, Yuan He, Yimiao Sun, Xiuzhen Guo, Weiguo Wang, Xin Na, Yunhao Liu, Zhenguo Shi, and Tao Gu. A survey of mmwave-based human sensing: Technology, platforms and applications. IEEE Communications Surveys & Tutorials, 2023. 1

  2. [2]

    Review of the theory, principles, and design requirements of human-centric internet of things (iot)

    Kaja Fjørtoft Ystgaard, Luigi Atzori, David Palma, Poul Einar Heegaard, Lene Elisabeth Bertheussen, Magnus Rom Jensen, and Katrien De Moor. Review of the theory, principles, and design requirements of human-centric internet of things (iot). Journal of Ambient Intelligence and Humanized Computing, 2023. 1

  3. [3]

    Applying internet of things and machine-learning for personalized healthcare: Issues and challenges

    Farhad Ahamed and Farnaz Farid. Applying internet of things and machine-learning for personalized healthcare: Issues and challenges. In iCMLDE ’18, 2018. 1

  4. [4]

    A survey on deep learning empowered iot applications

    Xiaoqiang Ma, Tai Yao, Menglan Hu, Yan Dong, Wei Liu, Fangxin Wang, and Jiangchuan Liu. A survey on deep learning empowered iot applications. IEEE Access, 2019. 1

  5. [5]

    Mdldroidlite: a release-and-inhibit control approach to resource-efficient deep neural networks on mobile devices

    Yu Zhang, Tao Gu, and Xi Zhang. Mdldroidlite: a release-and-inhibit control approach to resource-efficient deep neural networks on mobile devices. In SenSys ’20, 2020. 1

  6. [6]

    Deep learning on mobile and embedded devices: State-of-the-art, challenges, and future directions

    Yanjiao Chen, Baolin Zheng, Zihan Zhang, Qian Wang, Chao Shen, and Qian Zhang. Deep learning on mobile and embedded devices: State-of-the-art, challenges, and future directions. ACM Comput. Surv., 2020. 1

  7. [7]

    A survey of on-device machine learning: An algorithms and learning theory perspective

    Sauptik Dhar, Junyao Guo, Jiayi (Jason) Liu, Samarth Tripathi, Unmesh Kurup, and Mohak Shah. A survey of on-device machine learning: An algorithms and learning theory perspective. ACM Trans. Internet Things, 2021. 1

  8. [8]

    C. C. Sobin. A survey on architecture, protocols and challenges in iot. Wirel. Pers. Commun.,

  9. [9]

    Opportunities and challenges of wireless human sensing for the smart iot world: A survey

    Zijuan Liu, Xiulong Liu, Jiuwu Zhang, and Keqiu Li. Opportunities and challenges of wireless human sensing for the smart iot world: A survey. IEEE Network, 2019. 1

  10. [10]

    Lane and Petko Georgiev

    Nicholas D. Lane and Petko Georgiev. Can deep learning revolutionize mobile sensing? In HotMobile ’15, 2015. 1

  11. [11]

    Sensehar: A robust virtual activity sensor for smartphones and wearables

    Jeya Vikranth Jeyakumar, Liangzhen Lai, Naveen Suda, and Mani Srivastava. Sensehar: A robust virtual activity sensor for smartphones and wearables. In SenSys ’19, 2019. 1

  12. [12]

    Kempa-Liehr, and Kevin I-Kai Wang

    Hui Yie Teh, Andreas W. Kempa-Liehr, and Kevin I-Kai Wang. Sensor data quality: a systematic review. Journal of Big Data, 2020. 1

  13. [13]

    Leon Kröger, Philip R., and T

    J. Leon Kröger, Philip R., and T. Rahman B. Privacy implications of accelerometer data: A review of possible inferences. In ICCSP ’19, 2019. 1

  14. [14]

    Mdldroid: A chainsgd-reduce approach to mobile deep learning for personal mobile sensing

    Yu Zhang, Tao Gu, and Xi Zhang. Mdldroid: A chainsgd-reduce approach to mobile deep learning for personal mobile sensing. IEEE/ACM Trans. Netw., 2022. 1

  15. [15]

    A com- prehensive survey of few-shot learning: Evolution, applications, challenges, and opportunities

    Yisheng Song, Ting Wang, Puyu Cai, Subrota K Mondal, and Jyoti Prakash Sahoo. A com- prehensive survey of few-shot learning: Evolution, applications, challenges, and opportunities. ACM Comput. Surv., 2023. 1, 2, 15

  16. [16]

    Deep learning for sensor-based human activity recognition: Overview, challenges and opportunities

    Kaixuan Chen, Dalin Zhang, Lina Yao, Bin Guo, Zhiwen Yu, and Yunhao Liu. Deep learning for sensor-based human activity recognition: Overview, challenges and opportunities. 2020. 1

  17. [17]

    Jamal Deen, and Jiannong Cao

    Jiang Xiao, Huichuwu Li, Minrui Wu, Hai Jin, M. Jamal Deen, and Jiannong Cao. A survey on wireless device-free human sensing: Application scenarios, current solutions, and open issues. ACM Comput. Surv., 2022. 1

  18. [18]

    Metasense: Few-shot adaptation to untrained conditions in deep mobile sensing

    Taesik Gong, Yeonsu Kim, Jinwoo Shin, and Sung-Ju Lee. Metasense: Few-shot adaptation to untrained conditions in deep mobile sensing. SenSys ’19, 2019. 1, 2, 3, 7, 8, 9, 15

  19. [19]

    Review of few-shot learning application in csi human sensing

    Zhengjie Wang, Jianhang Li, Wenchao Wang, Zhaolei Dong, Qingwei Zhang, and Yinjing Guo. Review of few-shot learning application in csi human sensing. Artificial Intelligence Review,

  20. [20]

    Transfer learning in human activity recognition: A survey

    Sourish Gunesh Dhekane and Thomas Ploetz. Transfer learning in human activity recognition: A survey. 2024. 1, 3

  21. [21]

    Cross-domain har: Few-shot transfer learning for human activity recognition

    Megha Thukral, Harish Haresamudram, and Thomas Plötz. Cross-domain har: Few-shot transfer learning for human activity recognition. ACM Trans. Intell. Syst. Technol., 2025. 1, 3

  22. [22]

    Fewsense, towards a scalable and cross-domain wi-fi sensing system using few-shot learning

    Guolin Yin, Junqing Zhang, Guanxiong Shen, and Yingying Chen. Fewsense, towards a scalable and cross-domain wi-fi sensing system using few-shot learning. IEEE Transactions on Mobile Computing, 2024. 1, 3

  23. [23]

    Understanding cross-domain few-shot learning based on domain similarity and few-shot difficulty

    Jaehoon Oh, Sungnyun Kim, Namgyu Ho, Jin-Hwa Kim, Hwanjun Song, and Se-Young Yun. Understanding cross-domain few-shot learning based on domain similarity and few-shot difficulty. NeurIPS ’22, 2022. 2, 3, 7, 15

  24. [24]

    Powering finetuning in few-shot learning: Domain-agnostic bias reduction with selected sampling

    Ran Tao, Han Zhang, Yutong Zheng, and Marios Savvides. Powering finetuning in few-shot learning: Domain-agnostic bias reduction with selected sampling. AAAI ’22, 2022. 2

  25. [25]

    Sommelier: Curating dnn models for the masses

    Peizhen Guo, Bo Hu, and Wenjun Hu. Sommelier: Curating dnn models for the masses. SIGMOD ’22, 2022. 2

  26. [26]

    Mm-fi: multi-modal non-intrusive 4d human dataset for versatile wireless sensing

    Jianfei Yang, He Huang, Yunjiao Zhou, Xinyan Chen, Yuecong Xu, Shenghai Yuan, Han Zou, Chris Xiaoxuan Lu, and Lihua Xie. Mm-fi: multi-modal non-intrusive 4d human dataset for versatile wireless sensing. NeurIPS ’23, 2023. 2

  27. [27]

    X-fi: A modality-invariant foundation model for multimodal human sensing

    Xinyan Chen and Jianfei Yang. X-fi: A modality-invariant foundation model for multimodal human sensing. In ICLR ’25, 2025. 2

  28. [28]

    Proba- bilistic conformal distillation for enhancing missing modality robustness

    mengxi Chen, Fei Zhang, Zihua Zhao, Jiangchao Yao, Ya Zhang, and Yanfeng Wang. Proba- bilistic conformal distillation for enhancing missing modality robustness. In NeurIPS ’24, 2024. 2

  29. [29]

    Jointly modeling inter- & intra-modality dependencies for multi-modal learning

    Divyam Madaan, Taro Makino, Sumit Chopra, and Kyunghyun Cho. Jointly modeling inter- & intra-modality dependencies for multi-modal learning. In NeurIPS ’24, 2024. 2

  30. [30]

    Facilitating multimodal classification via dynamically learning modality gap

    Yang Yang, Fengqiang Wan, Qing-Yuan Jiang, and Yi Xu. Facilitating multimodal classification via dynamically learning modality gap. In NeurIPS ’24, 2024. 2

  31. [31]

    Multi-modal con- trastive learning for online clinical time-series applications

    Fabian Baldenweg, Manuel Burger, Gunnar Ratsch, and Rita Kuznetsova. Multi-modal con- trastive learning for online clinical time-series applications. InICLR 2024 Workshop on Learning from Time Series For Health, 2024. 2

  32. [32]

    Through-wall human pose estimation using radio signals

    Mingmin Zhao, Tianhong Li, Mohammad Abu Alsheikh, Yonglong Tian, Hang Zhao, Antonio Torralba, and Dina Katabi. Through-wall human pose estimation using radio signals. In CVPR ’18, 2018. 2

  33. [33]

    Facelistener: Recognizing human facial expressions via acoustic sensing on commodity headphones

    Xingzhe Song, Kai Huang, and Wei Gao. Facelistener: Recognizing human facial expressions via acoustic sensing on commodity headphones. In IPSN ’22, 2022. 2

  34. [34]

    Gurbuz, M

    Sevgi Z. Gurbuz, M. Mahbubur Rahman, Emre Kurtoglu, Trevor Macks, and Francesco Fio- ranelli. Cross-frequency training with adversarial learning for radar micro-Doppler signature classification (Rising Researcher). In Radar Sensor Technology XXIV, 2020. 2

  35. [35]

    mmFER: Millimetre-wave Radar based Facial Expression Recognition for Multimedia IoT Applications

    Xi Zhang, Yu Zhang, Zhenguo Shi, and Tao Gu. mmFER: Millimetre-wave Radar based Facial Expression Recognition for Multimedia IoT Applications. MobiCom ’23. 2023. 2

  36. [36]

    Imagebind: One embedding space to bind them all

    Rohit Girdhar, Alaaeldin El-Nouby, Zhuang Liu, Mannat Singh, Kalyan Vasudev Alwala, Armand Joulin, and Ishan Misra. Imagebind: One embedding space to bind them all. In CVPR ’23, 2023. 2

  37. [37]

    Next-gpt: Any-to-any multimodal llm

    Shengqiong Wu, Hao Fei, Leigang Qu, Wei Ji, and Tat-Seng Chua. Next-gpt: Any-to-any multimodal llm. CoRR, 2023. 2

  38. [38]

    Discover and publish models to a pre-trained model repository designed for research exploration, 2022

    PyTorch Hub. Discover and publish models to a pre-trained model repository designed for research exploration, 2022. 2, 7, 9, 15 11

  39. [39]

    A survey on negative transfer

    Wen Zhang, Lingfei Deng, Lei Zhang, and Dongrui Wu. A survey on negative transfer. IEEE/CAA Journal of Automatica Sinica, 2023. 2, 3

  40. [40]

    Multi-source distilling domain adaptation

    Sicheng Zhao, Guangzhi Wang, Shanghang Zhang, Yang Gu, Yaxian Li, Zhichao Song, Pengfei Xu, Runbo Hu, Hua Chai, and Kurt Keutzer. Multi-source distilling domain adaptation. In AAAI ’20, 2020. 2, 3, 7, 9

  41. [41]

    Sangiovanni-Vincentelli

    Xiangyu Yue, Zangwei Zheng, Colorado Reed, Hari Prasanna Das, Kurt Keutzer, and Alberto L. Sangiovanni-Vincentelli. Multi-source few-shot domain adaptation. CoRR, 2021. 2, 3

  42. [42]

    Learning new tricks from old dogs: Multi-source transfer learning from pre-trained networks

    Joshua Lee, Prasanna Sattigeri, and Gregory Wornell. Learning new tricks from old dogs: Multi-source transfer learning from pre-trained networks. In NeurIPS ’19, 2019. 2, 3, 7, 9

  43. [43]

    Neural architecture search: Insights from 1000 papers

    Colin White, Mahmoud Safari, Rhea Sukthanker, Binxin Ru, Thomas Elsken, Arber Zela, Debadeepta Dey, and Frank Hutter. Neural architecture search: Insights from 1000 papers

  44. [44]

    Adaptivenet: Post-deployment neural architecture adaptation for diverse edge environments

    Hao Wen, Yuanchun Li, Zunshuai Zhang, Shiqi Jiang, Xiaozhou Ye, Ye Ouyang, Yaqin Zhang, and Yunxin Liu. Adaptivenet: Post-deployment neural architecture adaptation for diverse edge environments. MobiCom ’23, 2023. 2, 3

  45. [45]

    Rui Han, Qinglong Zhang, Chi Harold Liu, Guoren Wang, Jian Tang, and Lydia Y . Chen. Legodnn: block-grained scaling of deep neural networks for mobile vision. MobiCom ’21,

  46. [46]

    Structured pruning of deep convolutional neural networks

    Sajid Anwar, Kyuyeon Hwang, and Wonyong Sung. Structured pruning of deep convolutional neural networks. J. Emerg. Technol. Comput. Syst., 2017. 2, 3, 17

  47. [47]

    The lottery ticket hypothesis: Finding sparse, trainable neural networks

    Jonathan Frankle and Michael Carbin. The lottery ticket hypothesis: Finding sparse, trainable neural networks. In ICLR ’19, 2019. 2, 3, 17, 18

  48. [48]

    Composition of saliency metrics for pruning with a myopic oracle

    Kaveena Persand, Andrew Anderson, and David Gregg. Composition of saliency metrics for pruning with a myopic oracle. In SSCI ’20, 2020. 2, 3

  49. [49]

    Model compression via distillation and quantization

    Antonio Polino, Razvan Pascanu, and Dan Alistarh. Model compression via distillation and quantization. ArXiv, abs/1802.05668, 2018. 2, 3

  50. [50]

    Maying Shen, Hongxu Yin, Pavlo Molchanov, Lei Mao, Jianna Liu, and Jose M. Alvarez. Structural pruning via latency-saliency knapsack. In NeurIPS ’22, 2022. 2, 18

  51. [51]

    Multivariate statistical machine learning methods for genomic prediction

    Osval Antonio Montesinos López, Abelardo Montesinos López, and José Crossa. Multivariate statistical machine learning methods for genomic prediction. Springer Nature, 2022. 3, 15

  52. [52]

    Domain-adaptive few-shot learning

    An Zhao, Mingyu Ding, Zhiwu Lu, Tao Xiang, Yulei Niu, Jiechao Guan, and Ji-Rong Wen. Domain-adaptive few-shot learning. In WACV ’21, 2021. 3, 7, 9, 15

  53. [53]

    Prototypical cross-domain self-supervised learning for few-shot unsupervised domain adaptation

    Xiangyu Yue, Zangwei Zheng, Shanghang Zhang, Yang Gao, Trevor Darrell, Kurt Keutzer, and Alberto Sangiovanni-Vincentelli. Prototypical cross-domain self-supervised learning for few-shot unsupervised domain adaptation. In CVPR ’21, 2021. 3

  54. [54]

    One fits all: Power general time series analysis by pretrained LM

    Tian Zhou, Peisong Niu, Xue Wang, Liang Sun, and Rong Jin. One fits all: Power general time series analysis by pretrained LM. NeurIPS ’23, 2023. 3, 7, 8, 9

  55. [55]

    Boosting the generalization capability in cross-domain few-shot learning via noise-enhanced supervised autoencoder

    Hanwen Liang, Qiong Zhang, Peng Dai, and Juwei Lu. Boosting the generalization capability in cross-domain few-shot learning via noise-enhanced supervised autoencoder. In ICCV ’21,

  56. [56]

    A closer look at few-shot classification

    Wei-Yu Chen, Yen-Cheng Liu, Zsolt Kira, Yu-Chiang Frank Wang, and Jia-Bin Huang. A closer look at few-shot classification. In ICLR ’19, 2019. 3, 7, 15

  57. [57]

    Gemel: Model merging for Memory-Efficient, Real-Time video analytics at the edge

    Arthi Padmanabhan, Neil Agarwal, Anand Iyer, Ganesh Ananthanarayanan, Yuanchao Shu, Nikolaos Karianakis, Guoqing Harry Xu, and Ravi Netravali. Gemel: Model merging for Memory-Efficient, Real-Time video analytics at the edge. NSDI ’23, 2023. 3 12

  58. [58]

    Once for all: Train one network and specialize it for efficient deployment

    Han Cai, Chuang Gan, Tianzhe Wang, Zhekai Zhang, and Song Han. Once for all: Train one network and specialize it for efficient deployment. ICLR ’20, 2020. 3

  59. [59]

    Neural fine-tuning search for few-shot learning

    Panagiotis Eustratiadis, Łukasz Dudziak, Da Li, and Timothy Hospedales. Neural fine-tuning search for few-shot learning. ICLR ’24, 2024. 3

  60. [60]

    Few-shot task-agnostic neural architecture search for distilling large language models

    Dongkuan Xu, Subhabrata Mukherjee, Xiaodong Liu, Debadeepta Dey, Wenhui Wang, Xiang Zhang, Ahmed Hassan Awadallah, and Jianfeng Gao. Few-shot task-agnostic neural architecture search for distilling large language models. In NeurIPS ’22, 2022. 3

  61. [61]

    A broad study on the transferability of visual representations with contrastive learning

    Ashraful Islam, Chun-Fu Richard Chen, Rameswar Panda, Leonid Karlinsky, Richard Radke, and Rogerio Feris. A broad study on the transferability of visual representations with contrastive learning. In ICCV ’21, 2021. 3

  62. [62]

    Channel importance matters in few-shot image classification

    Xu Luo, Jing Xu, and Zenglin Xu. Channel importance matters in few-shot image classification. In ICML ’22, 2022. 3

  63. [63]

    Cluster quality analysis using silhouette score

    Ketan Rajshekhar Shahapure and Charles Nicholas. Cluster quality analysis using silhouette score. In DSAA ’20, 2020. 4, 15

  64. [64]

    Improved deep metric learning with multi-class n-pair loss objective

    Kihyuk Sohn. Improved deep metric learning with multi-class n-pair loss objective. NIPS ’16,

  65. [65]

    Facenet: A unified embedding for face recognition and clustering

    Florian Schroff, Dmitry Kalenichenko, and James Philbin. Facenet: A unified embedding for face recognition and clustering. In CVPR ’15, 2015. 4, 7

  66. [66]

    Chapter 8 - from neural pca to deep unsupervised learning

    Harri Valpola. Chapter 8 - from neural pca to deep unsupervised learning. In Advances in Independent Component Analysis and Learning Machines. 2015. 4

  67. [67]

    Cosine similarity metric learning for face verification

    Hieu V Nguyen and Li Bai. Cosine similarity metric learning for face verification. In Asian conference on computer vision, 2010. 5

  68. [68]

    Transferability in deep learning: A survey

    Junguang Jiang, Yang Shu, Jianmin Wang, and Mingsheng Long. Transferability in deep learning: A survey. ArXiv, 2022. 7, 15, 16

  69. [69]

    Prototypical networks for few-shot learning

    Jake Snell, Kevin Swersky, and Richard Zemel. Prototypical networks for few-shot learning. In NeurIPS ’17, 2017. 7, 9

  70. [70]

    Model-agnostic meta-learning for fast adapta- tion of deep networks

    Chelsea Finn, Pieter Abbeel, and Sergey Levine. Model-agnostic meta-learning for fast adapta- tion of deep networks. In ICML ’17, 2017. 7, 9, 15

  71. [71]

    Matching networks for one shot learning

    Oriol Vinyals, Charles Blundell, Timothy Lillicrap, Daan Wierstra, et al. Matching networks for one shot learning. Advances in neural information processing systems, 29, 2016. 7

  72. [72]

    Domain adaptive ensemble learning

    Kaiyang Zhou, Yongxin Yang, Yu Qiao, and Tao Xiang. Domain adaptive ensemble learning. IEEE Transactions on Image Processing, 2021. 7

  73. [73]

    Adapting visual category models to new domains

    Kate Saenko, Brian Kulis, Mario Fritz, and Trevor Darrell. Adapting visual category models to new domains. In Computer Vision–ECCV 2010: 11th European Conference on Computer Vision, Heraklion, Crete, Greece, September 5-11, 2010, Proceedings, Part IV 11 , pages 213–226. Springer, 2010. 7

  74. [74]

    Deep hashing network for unsupervised domain adaptation

    Hemanth Venkateswara, Jose Eusebio, Shayok Chakraborty, and Sethuraman Panchanathan. Deep hashing network for unsupervised domain adaptation. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 5018–5027, 2017. 7

  75. [75]

    Fergus, and P

    Li Fei-Fei, R. Fergus, and P. Perona. Learning generative visual models from few training ex- amples: An incremental bayesian approach tested on 101 object categories. In 2004 Conference on Computer Vision and Pattern Recognition Workshop, 2004. 7

  76. [76]

    Sunil Thulasidasan, Sushil Thapa, Sayera Dhaubhadel, Gopinath Chennupati, Tanmoy Bhat- tacharya, and Jeff A. Bilmes. An effective baseline for robustness to distributional shift. CoRR,

  77. [77]

    V oxceleb: a large-scale speaker identification dataset

    Arsha Nagrani, Joon Son Chung, and Andrew Zisserman. V oxceleb: a large-scale speaker identification dataset. CoRR, 2017. 7

  78. [78]

    Roggen, A

    D. Roggen, A. Calatroni, M. Rossi, T. Holleczek, K. Förster, G. Tröster, P. Lukowicz, D. Ban- nach, G. Pirkl, A. Ferscha, J. Doppler, C. Holzmann, M. Kurz, G. Holl, R. Chavarriaga, H. Sagha, H. Bayati, M. Creatura, and J. d. R. Millàn. Collecting complex activity datasets in highly rich networked sensor environments. In INSS’10, 2010. 7

  79. [79]

    Smart devices are different: Assessing and mitigatingmobile sensing heterogeneities for activity recognition

    Allan Stisen, Henrik Blunck, Sourav Bhattacharya, Thor Siiger Prentow, Mikkel Baun Kjær- gaard, Anind Dey, Tobias Sonne, and Mads Møller Jensen. Smart devices are different: Assessing and mitigatingmobile sensing heterogeneities for activity recognition. SenSys ’15,

  80. [80]

    Introducing wesad, a multimodal dataset for wearable stress and affect detection

    Philip Schmidt, Attila Reiss, Robert Duerichen, Claus Marberger, and Kristof Van Laerhoven. Introducing wesad, a multimodal dataset for wearable stress and affect detection. ICMI ’18,

Showing first 80 references.