Recognition: 2 theorem links · Lean Theorem
Active Imitation Learning for Thermal- and Kernel-Aware LFM Inference on 3D S-NUCA Many-Cores
Pith reviewed 2026-05-10 16:27 UTC · model grok-4.3
The pith
Active imitation learning derives thermal-safe policies for LFM inference on 3D S-NUCA many-cores by imitating near-optimal oracle schedules.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
AILFM is a scheduling framework based on active imitation learning that learns near-optimal thermal-aware scheduling policies from oracle demonstrations. It accounts for core-level performance heterogeneity and kernel-specific behavior in large foundation models, aiming to keep the chip thermally safe while maximizing performance with minimal runtime overhead. The paper reports experiments in which it outperforms state-of-the-art baselines and generalizes across diverse LFM workloads on 3D S-NUCA systems.
What carries the argument
Active Imitation Learning (AIL) scheduler that imitates oracle policies for thread migration and V/f scaling, tailored to core heterogeneity and LFM kernel diversity.
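To make the mechanism concrete, here is a minimal Python sketch of an active imitation learning loop. It is not the paper's implementation: the one-step thermal model `next_temp`, the threshold `T_TH`, the frequency list `VF_LEVELS`, the greedy `oracle`, the nearest-neighbour learner, and the `radius` query rule are all assumptions chosen for illustration. The learner acts on its own when the current state is close to one it already holds a label for, and asks the oracle otherwise.

```python
import random

T_TH = 80.0                        # hypothetical thermal threshold (deg C)
VF_LEVELS = [1.0, 2.0, 3.0, 4.0]   # hypothetical frequency levels (GHz)


def next_temp(temp, freq, heat):
    """Toy one-step thermal model (an assumption, not a real simulator)."""
    return 0.8 * temp + 0.2 * 40.0 + heat * freq


def oracle(state):
    """Greedy stand-in oracle: fastest level that keeps the next step safe."""
    temp, heat = state
    safe = [f for f in VF_LEVELS if next_temp(temp, f, heat) <= T_TH]
    return max(safe) if safe else min(VF_LEVELS)


def nearest(dataset, state):
    """Return (distance, action) of the closest labelled state."""
    def dist(s):
        return abs(s[0] - state[0]) / 10.0 + abs(s[1] - state[1])
    best = min(dataset, key=lambda item: dist(item[0]))
    return dist(best[0]), best[1]


def policy(dataset, state):
    """Learned policy: copy the action of the nearest demonstration."""
    return nearest(dataset, state)[1]


def train(episodes=30, steps=40, radius=0.3, seed=0):
    """Collect demonstrations, querying the oracle only in unfamiliar states."""
    rng = random.Random(seed)
    start = (40.0, 1.0)
    dataset = [(start, oracle(start))]   # one seed demonstration
    queries = 0
    for _ in range(episodes):
        temp = 40.0
        for _ in range(steps):
            heat = rng.choice([1.0, 2.0, 3.0])   # stand-in for kernel type
            state = (temp, heat)
            distance, action = nearest(dataset, state)
            if distance > radius:                # unfamiliar state: ask oracle
                action = oracle(state)
                dataset.append((state, action))
                queries += 1
            temp = next_temp(temp, action, heat)
    return dataset, queries
```

Calling `train()` returns the labelled demonstrations and the number of oracle queries. In this toy setup the query count stays far below the number of scheduling steps because nearby states reuse earlier labels. Thread migration is left out to keep the sketch short.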
If this is right
- Maintains thermal safety while maximizing performance on heterogeneous 3D S-NUCA many-cores.
- Adapts to diverse LFM kernels without high runtime overhead.
- Generalizes well to new LFM workloads beyond training examples.
- Outperforms state-of-the-art baselines in experiments.
Where Pith is reading between the lines
- Learning from oracles may replace hand-crafted models in other complex thermal management scenarios on many-cores.
- The framework could support production deployment for varied AI inference tasks on similar hardware.
- Similar imitation methods might address related problems like power management in heterogeneous systems.
Load-bearing premise
Oracle demonstrations of near-optimal policies exist and can be imitated with low runtime overhead while correctly capturing both core heterogeneity and kernel-specific LFM behavior to guarantee thermal safety.
What would settle it
The claim would be undermined if AILFM violated thermal limits or failed to outperform the baselines on a 3D S-NUCA system running an LFM kernel that was absent from the oracle training data.
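The criterion above can be phrased as a small check over one held-out run. The sketch below is illustrative only: the function name `check_claim`, its arguments, and the result keys are hypothetical, and the numbers in the usage line are made-up placeholders rather than results from the paper.

```python
def check_claim(temps_c, t_th_c, ailfm_time_s, baseline_times_s):
    """Evaluate one held-out run against the two conditions in the text.

    temps_c          : sampled peak-core temperatures during the run (deg C)
    t_th_c           : thermal threshold (deg C)
    ailfm_time_s     : inference time under AILFM (seconds)
    baseline_times_s : dict mapping baseline name -> inference time (seconds)
    """
    peak = max(temps_c)
    thermally_safe = peak <= t_th_c
    outperforms = all(ailfm_time_s < t for t in baseline_times_s.values())
    return {
        "peak_c": peak,
        "thermally_safe": thermally_safe,
        "outperforms": outperforms,
        "claim_survives": thermally_safe and outperforms,
    }


# Placeholder values for illustration only, not measurements.
result = check_claim([71.0, 78.5, 79.9], 80.0, 10.0, {"baseline_a": 12.0})
print(result["claim_survives"])
```

A single run where `claim_survives` is false on an unseen kernel would count against the generalization claim; a run where it is true only fails to refute it.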
Original abstract
Large Foundation Model (LFM) inference is both memory- and compute-intensive, traditionally relying on GPUs. However, the limited availability and high cost have motivated the adoption of high-performance general-purpose CPUs, especially emerging 3D-stacked Static Non-Uniform Cache Architecture (3D S-NUCA) systems. These architectures offer enhanced bandwidth and locality but suffer from severe thermal challenges and uneven cache latencies due to 3D Networks-on-Chip (NoC). Optimal management of thread migration and V/f scaling is non-trivial due to LFM kernel diversity and system heterogeneity. Existing thermal management approaches often rely on oversimplified analytical models and lack adaptability. We propose AILFM, an Active Imitation Learning (AIL)-based scheduling framework that learns near-optimal thermal-aware scheduling policies from Oracle demonstrations with minimal run-time overhead. AILFM accounts for both core-level performance heterogeneity and kernel-specific behavior in LFMs to maintain thermal safety while maximizing performance. Extensive experiments show that AILFM outperforms state-of-the-art baselines and generalizes well across diverse LFM workloads.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes AILFM, an active imitation learning framework for thermal- and kernel-aware scheduling of Large Foundation Model (LFM) inference on 3D S-NUCA many-core CPUs. It learns near-optimal policies for thread migration and V/f scaling from oracle demonstrations, explicitly accounting for core-level performance heterogeneity and kernel-specific LFM behaviors to maintain thermal safety while maximizing performance. The central claim is that extensive experiments demonstrate outperformance over state-of-the-art baselines together with strong generalization across diverse LFM workloads.
Significance. If the empirical results hold, the work could be significant for enabling cost-effective LFM deployment on general-purpose 3D-stacked CPU platforms instead of GPUs. It offers a learning-based alternative to oversimplified analytical thermal models and directly addresses heterogeneity and kernel diversity, which are increasingly relevant for AI systems. The use of imitation learning to transfer near-optimal policies with low runtime overhead is a promising direction if the oracles prove robust.
Major comments (3)
- [Abstract] Abstract: the claim that 'extensive experiments show that AILFM outperforms state-of-the-art baselines and generalizes well' is unsupported by any metrics, baseline names, workload descriptions, or methodology details. This is load-bearing for the paper's primary contribution.
- [Oracle demonstrations] Oracle demonstrations section: the construction, optimality verification, and fidelity of the oracle demonstrations to actual thermal dynamics and core-to-core variation are not described in sufficient detail. Because the entire AILFM policy is obtained by imitating these oracles, the absence of this information prevents verification that the learned policy improves upon or safely exceeds existing methods.
- [Experimental evaluation] Experimental evaluation: no quantitative results, tables, or figures reporting performance, thermal safety margins, runtime overhead, or cross-workload generalization metrics are referenced, rendering the outperformance and generalization claims unverifiable.
Simulated Author's Rebuttal
We thank the referee for the thorough and constructive review of our manuscript. We appreciate the identification of areas where additional clarity is needed and will revise the paper to strengthen the presentation of our contributions.
Point-by-point responses
- Referee: [Abstract] Abstract: the claim that 'extensive experiments show that AILFM outperforms state-of-the-art baselines and generalizes well' is unsupported by any metrics, baseline names, workload descriptions, or methodology details. This is load-bearing for the paper's primary contribution.
  Authors: We agree that the abstract would be strengthened by greater specificity. In the revised version, we will expand the abstract to name the primary baselines (standard DVFS, heuristic migration, and analytical thermal-model schedulers), briefly characterize the LFM workloads (diverse inference kernels from models such as BERT and GPT variants), and report key quantitative outcomes (performance speedup, thermal safety margin, and cross-workload generalization rate) drawn from the experimental results already present in the body of the paper.
  Revision: yes
- Referee: [Oracle demonstrations] Oracle demonstrations section: the construction, optimality verification, and fidelity of the oracle demonstrations to actual thermal dynamics and core-to-core variation are not described in sufficient detail. Because the entire AILFM policy is obtained by imitating these oracles, the absence of this information prevents verification that the learned policy improves upon or safely exceeds existing methods.
  Authors: We acknowledge the need for greater detail in the oracle section. We will revise this section to explicitly describe the oracle construction (offline optimization over thread-to-core mappings and V/f states using a cycle-accurate 3D thermal simulator), the verification procedure (comparison against exhaustive enumeration on reduced core counts and convergence to within a small optimality gap), and the fidelity checks (validation of simulated temperatures and per-core latency variation against hardware measurements on the target 3D S-NUCA platform). These additions will enable readers to assess the quality of the demonstrations used for imitation learning.
  Revision: yes
- Referee: [Experimental evaluation] Experimental evaluation: no quantitative results, tables, or figures reporting performance, thermal safety margins, runtime overhead, or cross-workload generalization metrics are referenced, rendering the outperformance and generalization claims unverifiable.
  Authors: The manuscript already contains the requested quantitative results in dedicated tables and figures (performance and thermal comparisons, overhead measurements, and generalization analysis). However, we agree that explicit cross-references were insufficient. We will revise the experimental section to directly cite the relevant tables and figures when stating each result and will add a short summary paragraph that consolidates the key metrics for outperformance and generalization. This constitutes a partial revision focused on presentation rather than new data collection.
  Revision: partial
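The verification step mentioned in the second response (exhaustive enumeration on reduced core counts) can be illustrated with a toy brute-force oracle. This is a sketch under stated assumptions, not the authors' procedure: the function `exhaustive_oracle`, the quadratic temperature model, the per-thread `act` factor, and every number in the example call are hypothetical stand-ins for the cycle-accurate thermal simulator the authors describe.

```python
from itertools import permutations, product


def exhaustive_oracle(work, act, speed, r_th, freqs, t_amb, t_th):
    """Enumerate every thread-to-core mapping and per-core frequency.

    work[i]  : work units of thread i            (hypothetical)
    act[i]   : switching activity of thread i    (hypothetical)
    speed[c] : relative speed of core c, e.g. due to cache distance
    r_th[c]  : toy thermal resistance of core c
    Returns the fastest thermally feasible configuration, or None.
    """
    n = len(work)
    best = None
    for mapping in permutations(range(n)):        # mapping[i] = core of thread i
        for vf in product(freqs, repeat=n):       # vf[c] = frequency of core c
            # Toy steady-state temperature of each occupied core.
            peak = max(t_amb + r_th[c] * act[i] * vf[c] ** 2
                       for i, c in enumerate(mapping))
            if peak > t_th:
                continue                          # violates the thermal limit
            # Completion time is set by the slowest thread.
            makespan = max(work[i] / (vf[c] * speed[c])
                           for i, c in enumerate(mapping))
            if best is None or makespan < best["makespan"]:
                best = {"makespan": makespan, "peak": peak,
                        "mapping": mapping, "vf": vf}
    return best


# Three threads on three cores; all values are illustrative placeholders.
example = exhaustive_oracle(
    work=[4.0, 2.0, 1.0], act=[1.0, 0.8, 0.5],
    speed=[1.0, 0.9, 0.8], r_th=[5.0, 4.0, 3.0],
    freqs=[1.0, 2.0, 3.0], t_amb=45.0, t_th=80.0)
print(example)
```

The search visits n! mappings times |freqs|^n frequency assignments, so it is only practical for a handful of cores. That cost is a plausible reason to restrict exhaustive comparison to reduced core counts and to use it as a reference for a faster offline optimizer.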
Circularity Check
No circularity: empirical imitation-learning framework with external oracle
The paper presents AILFM as an active imitation learning scheduler that learns thermal-aware policies from oracle demonstrations on heterogeneous 3D S-NUCA systems. All central claims (outperformance, generalization, thermal safety) rest on experimental comparisons to baselines rather than any mathematical derivation, equation, or first-principles result. No self-definitional loops, fitted inputs renamed as predictions, or load-bearing self-citations appear in the provided text. The oracle is treated as an external source of demonstrations; its construction is not shown to reduce to the learned policy itself. This is a standard empirical ML systems paper whose validity hinges on experimental evidence, not tautological reduction.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/AlexanderDuality.lean, theorem alexander_duality_circle_linking (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: We propose AILFM, an Active Imitation Learning (AIL)-based scheduling framework that learns near-optimal thermal-aware scheduling policies from Oracle demonstrations...
- IndisputableMonolith/Cost/FunctionalEquation.lean, theorem washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: The optimization objective is to minimize the LFM inference time while ensuring thermal safety... subject to T_peak(π_θ) ≤ T_th
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.