ImageHD: Energy-Efficient On-Device Continual Learning of Visual Representations via Hyperdimensional Computing
Pith reviewed 2026-05-09 22:31 UTC · model grok-4.3
The pith
ImageHD uses hyperdimensional computing on an FPGA to deliver up to 40.4x speedup and 383x energy savings for on-device continual learning of visual streams under tight memory limits.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
ImageHD implements a streaming dataflow architecture on the AMD Zynq ZCU104 that integrates HDC encoding, similarity search, and bounded cluster management using word-packed binary hypervectors for massively parallel bitwise computation. Combined with a compact quantized CNN for feature extraction and a unified exemplar memory plus hardware-efficient merging strategy, the system supports non-iterative online updates while staying inside strict on-chip memory and latency budgets. On the CORe50 dataset this yields up to 40.4x speedup and 383x energy efficiency over optimized CPU baselines, and 4.84x speedup with 105.1x better energy efficiency over GPU baselines.
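The claim rests on one low-level primitive: packing binary hypervectors into machine words so that similarity search reduces to XOR and popcount. The sketch below illustrates that primitive under stated assumptions; the dimension D = 8192, the 64-bit word width, and all names are illustrative, not values taken from the paper.

```cpp
#include <array>
#include <bit>
#include <cstddef>
#include <cstdint>

constexpr std::size_t D     = 8192;    // hypervector dimension (assumed)
constexpr std::size_t WORDS = D / 64;  // 64 bits packed per machine word

using HV = std::array<std::uint64_t, WORDS>;

// Hamming distance: XOR exposes differing bits, popcount tallies them.
// On an FPGA each word lane's XOR+popcount is a small block of LUTs, and
// all WORDS lanes can run concurrently -- the "massively parallel bitwise
// computation" the dataflow pipeline exploits.
std::size_t hamming(const HV& a, const HV& b) {
    std::size_t dist = 0;
    for (std::size_t w = 0; w < WORDS; ++w)
        dist += static_cast<std::size_t>(std::popcount(a[w] ^ b[w]));
    return dist;
}

// Similarity search: the stored hypervector with the smallest Hamming
// distance to the query wins.
std::size_t nearest(const HV& query, const HV* stored, std::size_t n) {
    std::size_t best = 0, best_dist = D + 1;  // D is the max possible distance
    for (std::size_t c = 0; c < n; ++c) {
        const std::size_t d = hamming(query, stored[c]);
        if (d < best_dist) { best_dist = d; best = c; }
    }
    return best;
}
```

Because classification is only this nearest-neighbor scan over a handful of stored hypervectors, no gradient step or iterative optimization appears anywhere on the inference or update path.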
What carries the argument
The hardware-aware cluster merging strategy together with a fixed unified exemplar memory bound, executed via word-packed binary hypervectors that enable parallel bitwise operations inside the FPGA dataflow pipeline.
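The abstract does not spell the merging strategy out, so the following is a minimal sketch of one way a bounded, unified exemplar memory with merge-on-overflow could behave, reusing the HV type, WORDS, D, and hamming() from the sketch above. The capacity, the nearest-pair merge policy, the same-label preference, and the random tie-break are all assumptions, not the paper's method.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <random>
#include <vector>

struct Slot {
    HV hv;              // word-packed cluster hypervector
    std::size_t label;  // class this cluster represents
};

struct ExemplarMemory {
    std::size_t capacity;     // fixed on-chip bound (assumed; must be >= 2)
    std::vector<Slot> slots;  // unified store shared by all classes
    std::mt19937_64 rng{42};  // tie-break source (seed is illustrative)

    // Majority-of-two merge: agreed bits survive, disagreeing bits are
    // settled by a random mask, so the result stays binary and the whole
    // operation is O(WORDS) bitwise ops -- cheap to lay out in LUTs.
    HV merge(const HV& a, const HV& b) {
        HV out;
        for (std::size_t w = 0; w < WORDS; ++w) {
            const std::uint64_t tie = rng();
            out[w] = (a[w] & b[w]) | ((a[w] ^ b[w]) & tie);
        }
        return out;
    }

    void insert(const HV& hv, std::size_t label) {
        if (slots.size() < capacity) {
            slots.push_back({hv, label});
            return;
        }
        // Memory full: merge the most similar pair of clusters, preferring
        // a same-label pair via a distance penalty on cross-label candidates.
        std::size_t bi = 0, bj = 1;
        std::size_t best = std::numeric_limits<std::size_t>::max();
        for (std::size_t i = 0; i < slots.size(); ++i)
            for (std::size_t j = i + 1; j < slots.size(); ++j) {
                std::size_t d = hamming(slots[i].hv, slots[j].hv);
                if (slots[i].label != slots[j].label) d += D;
                if (d < best) { best = d; bi = i; bj = j; }
            }
        slots[bi].hv = merge(slots[bi].hv, slots[bj].hv);
        slots.erase(slots.begin() + bj);
        slots.push_back({hv, label});
    }
};
```

With capacity fixed at synthesis time, the store never grows, which is what would let the design stay inside a single on-chip memory budget regardless of stream length.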
If this is right
- Continual learning becomes feasible for real-time visual streams on hardware that cannot store large exemplar sets or run gradient steps.
- Energy consumption drops enough to enable always-on adaptation in battery-powered edge cameras and sensors.
- The non-iterative HDC update path removes the latency spikes typical of backpropagation-based continual learners.
- A single on-chip memory budget suffices for both feature extraction and class representation, simplifying hardware design.
- Binary hypervector operations map directly to efficient FPGA bitwise logic, keeping resource use low.
Where Pith is reading between the lines
- The same bounded-memory HDC pattern could be tested on other sensor streams such as audio or IMU data if appropriate encoding functions are supplied.
- Larger on-chip memory in future FPGAs would reduce the frequency of merges and potentially raise accuracy without changing the algorithm.
- Replacing the current quantized CNN with an even lighter extractor might trade a small accuracy drop for further energy gains on the most constrained devices.
Load-bearing premise
The cluster merging and exemplar bounding will keep enough representative samples to maintain usable classification accuracy as new visual data arrives, without forcing extra off-chip accesses or post-hoc fixes.
What would settle it
Measure top-1 accuracy on CORe50 after a long sequence of new classes or distribution shifts while enforcing the stated on-chip memory ceiling; if accuracy falls sharply below the reported levels or the design requires off-chip traffic, the central claim does not hold.
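A minimal harness for that test could be a prequential (test-then-train) loop over the stream under the fixed bound, building on the two sketches above. The stream itself (CORe50 sessions run through the quantized CNN and HDC encoder) is a placeholder and is not reproduced here.

```cpp
#include <cstddef>
#include <vector>

struct Sample { HV hv; std::size_t label; };  // pre-encoded stream item

double prequential_accuracy(ExemplarMemory& mem,
                            const std::vector<Sample>& stream) {
    std::size_t correct = 0;
    for (const Sample& s : stream) {
        // Predict with the current bounded memory first...
        if (!mem.slots.empty()) {
            std::size_t pred = 0, best = D + 1;
            for (std::size_t c = 0; c < mem.slots.size(); ++c) {
                const std::size_t d = hamming(s.hv, mem.slots[c].hv);
                if (d < best) { best = d; pred = c; }
            }
            if (mem.slots[pred].label == s.label) ++correct;
        }
        // ...then update; any off-chip spill at this step would
        // falsify the central claim.
        mem.insert(s.hv, s.label);
    }
    return static_cast<double>(correct) / stream.size();
}
```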
Original abstract
On-device continual learning (CL) is critical for edge AI systems operating on non-stationary data streams, but most existing methods rely on backpropagation or exemplar-heavy classifiers, incurring substantial compute, memory, and latency overheads. Hyperdimensional computing (HDC) offers a lightweight alternative through fast, non-iterative online updates. Combined with a compact convolutional neural network (CNN) feature extractor, HDC enables efficient on-device adaptation with strong visual representations. However, prior HDC-based CL systems often depend on multi-tier memory hierarchies and complex cluster management, limiting deployability on resource-constrained hardware. We present ImageHD, an FPGA accelerator for on-device continual learning of visual data based on HDC. ImageHD targets streaming CL under strict latency and on-chip memory constraints, avoiding costly iterative optimization. At the algorithmic level, we introduce a hardware-aware CL method that bounds class exemplars through a unified exemplar memory and a hardware-efficient cluster merging strategy, while incorporating a quantized CNN front-end to reduce deployment overhead without sacrificing accuracy. At the system level, ImageHD is implemented as a streaming dataflow architecture on the AMD Zynq ZCU104 FPGA, integrating HDC encoding, similarity search, and bounded cluster management using word-packed binary hypervectors for massively parallel bitwise computation within tight on-chip resource budgets. On CORe50, ImageHD achieves up to 40.4x (4.84x) speedup and 383x (105.1x) energy efficiency over optimized CPU (GPU) baselines, demonstrating the practicality of HDC-enabled continual learning for real-time edge AI.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces ImageHD, an FPGA accelerator for on-device continual learning of visual representations. It combines a quantized CNN front-end with hyperdimensional computing (HDC) for fast, non-iterative updates, proposing a hardware-aware continual learning method that uses a unified exemplar memory and hardware-efficient cluster merging to bound on-chip memory usage under streaming non-stationary data. The system is implemented as a streaming dataflow architecture on the AMD Zynq ZCU104 FPGA using word-packed binary hypervectors. On the CORe50 dataset, the work claims up to 40.4x speedup and 383x energy efficiency over optimized CPU baselines (and 4.84x / 105.1x over GPU), while satisfying strict latency and on-chip memory constraints for real-time edge AI.
Significance. If the accuracy and forgetting metrics under the bounded-exemplar strategy are shown to be competitive with unbounded HDC or standard CL baselines, the result would be significant for resource-constrained edge devices. The working FPGA implementation, the use of massively parallel bitwise operations, and the quantified efficiency gains over both CPU and GPU baselines are concrete engineering contributions that could be directly useful for deploying continual learning on FPGAs.
Major comments (2)
- [Abstract] Abstract and experimental claims: the central performance numbers (40.4x speedup, 383x energy) are presented without any accompanying classification accuracy, average forgetting rate, or ablation results comparing the unified exemplar memory + cluster merging strategy against unbounded HDC or standard replay-based CL methods. Because the hardware-aware bounding strategy is load-bearing for the on-device practicality claim, the absence of these metrics leaves the strongest claim unsupported.
- [Experimental Results] Experimental section (inferred from abstract): no error bars, standard deviations, or multiple-run statistics are referenced for the reported speedups and energy figures; likewise, no description of the exact CPU/GPU baseline implementations (e.g., whether they use the same quantized CNN or full-precision models) is provided, preventing assessment of whether the efficiency gains are robust or baseline-dependent.
Minor comments (2)
- [Abstract] The abstract states concrete speedup and energy numbers yet provides no forward reference to the table or figure containing the corresponding accuracy results; adding such a pointer would improve readability.
- Notation for hypervector operations and cluster merging could be clarified with a small pseudocode block or an explicit definition of the merging threshold, as the current description leaves the exact hardware-efficient strategy somewhat underspecified; one illustrative formulation is sketched below.
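To make that request concrete, here is one plausible way to pin down a merging threshold, reusing hamming() and D from the earlier sketches. Neither the normalized-similarity formula nor tau = 0.9 comes from the paper; both are illustrative guesses.

```cpp
// Normalized Hamming similarity: 1.0 for identical hypervectors,
// about 0.5 for random, unrelated ones.
inline double similarity(const HV& a, const HV& b) {
    return 1.0 - static_cast<double>(hamming(a, b)) / static_cast<double>(D);
}

constexpr double TAU = 0.9;  // hypothetical merge threshold

// Merge two clusters only when they are close enough to be redundant.
inline bool should_merge(const HV& a, const HV& b) {
    return similarity(a, b) >= TAU;
}
```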
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. We address each major comment below and outline the revisions we will make to strengthen the presentation of our results while preserving the core contributions of the work.
Point-by-point responses
- Referee: [Abstract] Abstract and experimental claims: the central performance numbers (40.4x speedup, 383x energy) are presented without any accompanying classification accuracy, average forgetting rate, or ablation results comparing the unified exemplar memory + cluster merging strategy against unbounded HDC or standard replay-based CL methods. Because the hardware-aware bounding strategy is load-bearing for the on-device practicality claim, the absence of these metrics leaves the strongest claim unsupported.
Authors: We agree that pairing the efficiency claims with accuracy and forgetting metrics in the abstract would better support the on-device practicality argument. The full experimental evaluation in the manuscript reports classification accuracy and average forgetting on CORe50 for the bounded-exemplar configuration. To address the concern directly, we will revise the abstract to include representative accuracy and forgetting figures alongside the speedup and energy numbers. We will also add a short statement summarizing the ablation comparing the unified exemplar memory and cluster-merging strategy to unbounded HDC, confirming that the bounded approach preserves competitive accuracy under the reported memory constraints.
Revision: yes
- Referee: [Experimental Results] Experimental section (inferred from abstract): no error bars, standard deviations, or multiple-run statistics are referenced for the reported speedups and energy figures; likewise, no description of the exact CPU/GPU baseline implementations (e.g., whether they use the same quantized CNN or full-precision models) is provided, preventing assessment of whether the efficiency gains are robust or baseline-dependent.
Authors: We acknowledge that the absence of error bars and explicit baseline details reduces the ability to assess robustness. In the revised manuscript we will report standard deviations obtained from multiple runs for all speedup and energy figures. We will also expand the experimental setup section with a precise description of the CPU and GPU baselines, clarifying that both use the identical quantized CNN front-end and HDC encoding pipeline as the FPGA implementation to ensure a fair comparison.
Revision: yes
Circularity Check
No circularity in empirical FPGA benchmarks or implementation claims
Full rationale
The paper presents an FPGA accelerator implementation for HDC-based continual learning with direct experimental measurements on the CORe50 dataset, reporting concrete speedups (40.4x/4.84x) and energy gains (383x/105.1x) versus CPU/GPU baselines. These results are obtained from hardware execution rather than any derivation, prediction, or first-principles result that reduces to fitted parameters or self-referential definitions by construction. No equations, uniqueness theorems, or ansatzes are invoked that equate outputs to inputs; the central claims rest on measured on-chip performance under bounded memory, with no load-bearing self-citations or renaming of known results. The work is self-contained as a systems contribution.