Pimp My LLM: Leveraging Variability Modeling to Tune Inference Hyperparameters
Pith reviewed 2026-05-16 06:42 UTC · model grok-4.3
The pith
A feature-based variability model captures LLM inference hyperparameter interactions to predict energy, latency, and accuracy from limited measurements.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Treating LLMs as configurable systems, a feature-based variability model is used to represent generation hyperparameters and constraints. Representative configurations are sampled, their energy consumption, latency, and accuracy are measured, and predictive models are learned that enable systematic analysis of hyperparameter effects and interactions, reveal trade-offs, and support prediction of inference behavior from a limited number of measurements.
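As a concrete sketch of what "a feature-based variability model of generation hyperparameters and constraints" can look like, the space can be expressed as value domains plus cross-tree boolean constraints, and valid configurations enumerated from it. The option names loosely follow the Hugging Face `generate()` API, but the domains and constraint rules below are illustrative assumptions, not the paper's actual feature model.

```python
from itertools import product

# Hypothetical value domains for a few generation hyperparameters.
DOMAINS = {
    "do_sample":   [False, True],
    "temperature": [0.2, 0.7, 1.0],   # only meaningful when sampling
    "top_p":       [0.5, 0.9, 1.0],   # only meaningful when sampling
    "num_beams":   [1, 2, 4],         # >1 enables beam search
}

# Illustrative cross-tree constraints: sampling knobs require do_sample=True,
# and beam search is modeled here as exclusive with sampling.
CONSTRAINTS = [
    lambda c: c["do_sample"] or (c["temperature"] == 1.0 and c["top_p"] == 1.0),
    lambda c: not (c["do_sample"] and c["num_beams"] > 1),
]

def valid_configurations():
    names = list(DOMAINS)
    for values in product(*DOMAINS.values()):
        config = dict(zip(names, values))
        if all(check(config) for check in CONSTRAINTS):
            yield config

configs = list(valid_configurations())
print(len(configs))  # far fewer valid points than the 2*3*3*3 = 54 raw combinations
```

Even this toy model shows why constraints matter: most raw combinations are invalid or meaningless, so sampling over the constrained space avoids wasting measurements.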
What carries the argument
Feature-based variability model that represents generation hyperparameters and constraints for sampling and predictive modeling of inference metrics.
If this is right
- Systematic analysis of hyperparameter effects and interactions becomes feasible without exhaustive testing.
- Trade-offs between energy efficiency, latency, and accuracy can be identified and managed.
- Inference behavior for new configurations can be predicted from a small number of measurements.
- This supports more sustainable LLM deployment by optimizing settings efficiently.
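To make the prediction-from-few-measurements point concrete, here is a minimal, hypothetical sketch: given a handful of measured configurations, estimate a metric for an unseen one via 1-nearest-neighbour lookup over a numeric encoding. The measurement values are invented placeholders, and the paper itself learns proper regression models rather than this toy predictor.

```python
# Invented (config, metric) pairs, e.g. energy in joules per request.
MEASURED = [
    ({"temperature": 0.2, "top_p": 0.5, "num_beams": 1}, 118.0),
    ({"temperature": 0.7, "top_p": 0.9, "num_beams": 1}, 131.0),
    ({"temperature": 1.0, "top_p": 1.0, "num_beams": 4}, 203.0),
]

def encode(config):
    """Map a configuration to a numeric feature vector."""
    return (config["temperature"], config["top_p"], config["num_beams"])

def predict(config):
    """Return the metric of the closest measured configuration."""
    x = encode(config)
    def sq_dist(item):
        y = encode(item[0])
        return sum((a - b) ** 2 for a, b in zip(x, y))
    return min(MEASURED, key=sq_dist)[1]

# An unmeasured configuration close to the second measured one.
print(predict({"temperature": 0.8, "top_p": 0.9, "num_beams": 1}))  # → 131.0
```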
Where Pith is reading between the lines
- The technique could extend to other resource-heavy ML tasks such as training or non-LLM inference to manage their configuration complexity.
- It might combine with automated search methods to reduce reliance on manual feature modeling.
- Hardware changes would likely require retraining the predictive models to keep forecasts accurate.
- Runtime adaptation of settings could use the predictions to respond to varying workloads.
Load-bearing premise
A feature-based variability model can accurately capture all relevant constraints and interactions among generation hyperparameters without missing important real-world behaviors or requiring excessive manual effort.
What would settle it
Measuring energy, latency, and accuracy on a new set of configurations outside the sampled ones and finding large prediction errors would show that the models fail to generalize.
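Such a generalization check reduces to computing held-out error metrics against fresh measurements. A minimal sketch, with invented numbers, of the MAE and R² computation that would settle the question:

```python
# Invented held-out values: measurements on configurations outside the
# training sample vs. the model's predictions for the same configurations.
measured  = [120.0, 135.0, 150.0, 210.0]
predicted = [118.0, 140.0, 147.0, 205.0]

n = len(measured)
mae = sum(abs(m - p) for m, p in zip(measured, predicted)) / n

mean = sum(measured) / n
ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
ss_tot = sum((m - mean) ** 2 for m in measured)
r2 = 1 - ss_res / ss_tot   # close to 1 means the predictor generalizes

print(round(mae, 2), round(r2, 3))
```

Large MAE or an R² far below 1 on such held-out configurations would be the falsifying observation.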
Original abstract
Large Language Models (LLMs) are being increasingly used across a wide range of tasks. However, their substantial computational demands raise concerns about the energy efficiency and sustainability of both training and inference. Inference, in particular, dominates total compute usage, making its optimization crucial. Recent research has explored optimization techniques and analyzed how configuration choices influence energy consumption. Yet, the vast configuration space of inference servers makes exhaustive empirical evaluation infeasible due to combinatorial explosion. In this paper, we introduce a new perspective on this problem by treating LLMs as configurable systems and applying variability management techniques to systematically analyze inference-time configuration choices. We evaluate our approach on the Hugging Face Transformers library by representing generation hyperparameters and their constraints using a feature-based variability model, sampling representative configurations, measuring their energy consumption, latency, accuracy, and learning predictive models from the collected data. Our results show that variability modeling effectively manages the complexity of LLM inference configurations. It enables systematic analysis of hyperparameters effects and interactions, reveals trade-offs, and supports prediction of inference behavior from a limited number of measurements. Overall, this work opens a new research direction that bridges software engineering and machine learning by leveraging variability modeling for the efficient and sustainable configuration of LLMs.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes modeling LLM inference hyperparameters (e.g., in Hugging Face Transformers) as a feature-based variability model to tame combinatorial explosion. Configurations are sampled, their energy consumption, latency, and accuracy are measured, and predictive models are trained on the resulting data to forecast inference behavior from a limited number of measurements. The central claim is that this variability-management approach enables systematic analysis of hyperparameter effects and interactions, reveals trade-offs, and supports accurate prediction without exhaustive evaluation.
Significance. If the predictive models generalize with low error on unsampled points, the work would usefully import variability-modeling techniques from software engineering into LLM configuration, offering a structured way to expose energy-accuracy-latency trade-offs and reduce measurement cost in large hyperparameter spaces.
Major comments (2)
- [Abstract] The claim that the method 'supports prediction of inference behavior from a limited number of measurements' is unsupported by quantitative evidence. No cross-validation error, test-set MAE, R², or held-out generalization results are reported for the learned predictors, nor is the sampling strategy or model family described. This is load-bearing for the central contribution.
- [Abstract] The manuscript provides no validation that the feature model captures all relevant interactions (e.g., temperature-top-p coupling on diversity metrics) or that the sampled points adequately cover high-variance regions of the configuration space; without such checks the 'limited measurements' benefit cannot be assessed.
Minor comments (1)
- [Abstract] The abstract states that 'variability modeling effectively manages the complexity' but does not define the concrete feature-model notation, constraint language, or sampling algorithm used.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. We address the major comments point by point below, agreeing where additional evidence is needed and outlining the revisions we will make.
Point-by-point responses
- Referee: [Abstract] The claim that the method 'supports prediction of inference behavior from a limited number of measurements' is unsupported by quantitative evidence. No cross-validation error, test-set MAE, R², or held-out generalization results are reported for the learned predictors, nor is the sampling strategy or model family described. This is load-bearing for the central contribution.
Authors: We agree that the abstract's claim regarding prediction from limited measurements requires quantitative backing. The manuscript details a feature-model-based sampling approach (using combinatorial interaction testing to select representative configurations) and trains regression-based predictive models on the resulting energy, latency, and accuracy measurements. To strengthen this, we will revise the abstract to report key metrics such as cross-validation error, test-set MAE, and R² values from our held-out evaluations, along with a brief description of the model family and sampling strategy. This directly addresses the load-bearing concern without changing the core approach.
Revision: yes
- Referee: [Abstract] The manuscript provides no validation that the feature model captures all relevant interactions (e.g., temperature-top-p coupling on diversity metrics) or that the sampled points adequately cover high-variance regions of the configuration space; without such checks the 'limited measurements' benefit cannot be assessed.
Authors: The feature model was derived from the Hugging Face Transformers API documentation and includes documented dependencies and interactions, such as the coupling between temperature and top-p that influences output diversity. Sampling employed variability-model techniques to ensure pairwise coverage of these interactions. We acknowledge that explicit post-sampling validation (e.g., quantitative checks on diversity-metric variance or coverage of high-variance regions) is not detailed in the current version. We will add this analysis in the revised manuscript, including coverage statistics and interaction-effect measurements, to substantiate the benefit of limited measurements.
Revision: yes
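The pairwise-coverage idea in the rebuttal can be sketched with a greedy t-wise (t = 2) sampler: keep picking the configuration that covers the most still-uncovered value pairs until every pair appears in some selected configuration. Real sampling tools additionally honour feature-model constraints, which this illustrative sketch (with made-up option domains) omits.

```python
from itertools import combinations, product

# Illustrative binary-ish option domains; not the paper's actual model.
DOMAINS = {
    "do_sample": [False, True],
    "top_k":     [0, 50],
    "num_beams": [1, 4],
}

def pairs(config):
    """All (option, value) pairs of the configuration, taken two at a time."""
    items = sorted(config.items())
    return set(combinations(items, 2))

all_configs = [dict(zip(DOMAINS, vals)) for vals in product(*DOMAINS.values())]
uncovered = set().union(*(pairs(c) for c in all_configs))

# Greedy selection: maximize newly covered pairs at each step.
sample = []
while uncovered:
    best = max(all_configs, key=lambda c: len(pairs(c) & uncovered))
    sample.append(best)
    uncovered -= pairs(best)

print(len(sample), "of", len(all_configs))  # → 4 of 8
```

Here 4 configurations suffice to cover all 12 value pairs of the 8-point space, which is the measurement-saving effect pairwise sampling is after.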
Circularity Check
No circularity: empirical sampling and modeling from measured data
Full rationale
The paper treats LLM inference as a configurable system, builds a feature-based variability model, samples configurations, measures energy/latency/accuracy, and trains predictive models on the resulting data. No equations, derivations, or self-citations reduce any claimed prediction to a quantity defined by the paper's own fitted parameters or inputs. The central results rest on external measurements and standard ML training rather than self-referential construction. This is a standard empirical SE/ML workflow with no load-bearing circular steps.
Axiom & Free-Parameter Ledger
Axioms (1)
- Domain assumption: LLM generation hyperparameters and their constraints can be represented using a feature-based variability model.