Transfer Learning for Neural Parameter Estimation applied to Building RC Models
Pith reviewed 2026-05-10 20:06 UTC · model grok-4.3
The pith
Pretraining neural networks on simulated building models and then fine-tuning on short target-building data improves RC thermal parameter estimation by 18.6-49.4% and removes the need for initial parameter guesses.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors present a transfer-learning framework for neural parameter estimation that pretrains on simulated RC thermal models and fine-tunes on target data. This eliminates dependence on initial parameter guesses and delivers 18.6-24.0% performance gains with only 12 days of training data, rising to 49.4% with 72 days, outperforming a genetic algorithm baseline and a from-scratch neural estimator on eight simulated buildings, one real building, and two RC configurations.
What carries the argument
The pretraining-fine-tuning paradigm for a neural network that maps building temperature and input time series directly to RC model parameters.
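For readers unfamiliar with RC thermal models, the quantities being estimated can be illustrated with a minimal sketch. It assumes the simplest single-resistance, single-capacitance (1R1C) structure and a forward-Euler discretization; the paper's own model structures, units, and solver are not reproduced here, and every name and value below is hypothetical.

```python
def simulate_1r1c(R, C, t_out, q_heat, T0, dt=3600.0):
    """Forward-Euler simulation of C * dT/dt = (T_out - T) / R + Q.

    R: thermal resistance in K/W, C: thermal capacitance in J/K,
    t_out: outdoor temperatures in degC, q_heat: heat inputs in W,
    T0: initial indoor temperature in degC, dt: step size in seconds.
    Returns the indoor temperature trajectory, including T0.
    """
    temps = [T0]
    for T_o, q in zip(t_out, q_heat):
        T = temps[-1]
        dT_dt = ((T_o - T) / R + q) / C  # heat balance on the single node
        temps.append(T + dt * dT_dt)
    return temps


# Hypothetical building: R = 0.01 K/W, C = 1e7 J/K, hourly steps.
n_steps = 1000
temps = simulate_1r1c(
    R=0.01, C=1e7,
    t_out=[0.0] * n_steps,      # constant outdoor temperature
    q_heat=[1000.0] * n_steps,  # constant heating power
    T0=20.0,
)
```

In this picture, a neural estimator of the kind described would receive series such as `temps`, `t_out`, and `q_heat` and return estimates of `R` and `C`.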
If this is right
- Parameter estimation for building thermal dynamics becomes feasible with data collection periods as short as 12 days.
- The method removes the requirement for hand-tuned initial parameter guesses in non-convex identification problems.
- The same pretrained features support multiple RC model structures without full retraining.
- Performance improvements increase with additional fine-tuning data while remaining substantial at minimal lengths.
Where Pith is reading between the lines
- The same pretraining strategy could accelerate parameter identification in other physics-based dynamical systems where simulation is cheap but real measurements are limited.
- Domain randomization during pretraining may be the key ingredient that makes the transferred features robust to real sensor noise and unmodeled dynamics.
- Building energy managers could use short-term monitoring campaigns to calibrate digital twins far more rapidly than with conventional optimization.
Load-bearing premise
Pretraining on simulated RC models produces features that transfer reliably to real-world buildings and different configurations without large domain shift or overfitting to simulation artifacts.
What would settle it
If a neural network trained from scratch on the same real building data consistently matches or exceeds the accuracy of the pretrained-and-fine-tuned network on held-out test periods, the claimed transfer benefit would be falsified.
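One way to make this test concrete is sketched below. It assumes both models have already been scored with the same error metric (lower is better) on the same held-out periods, and it reads "consistently" as "on every period". The function name and the example numbers are hypothetical and are not taken from the paper.

```python
def transfer_benefit_falsified(scratch_errors, transfer_errors):
    """Return True if the from-scratch model matches or beats the
    pretrained-and-fine-tuned model on every held-out period.

    Both arguments are per-period errors where lower is better.
    """
    if len(scratch_errors) != len(transfer_errors):
        raise ValueError("both models must be scored on the same periods")
    if not scratch_errors:
        raise ValueError("need at least one held-out period")
    return all(s <= t for s, t in zip(scratch_errors, transfer_errors))


# Hypothetical per-period errors, for illustration only.
scratch = [0.52, 0.47, 0.61]
transfer = [0.40, 0.39, 0.50]
falsified = transfer_benefit_falsified(scratch, transfer)
```

A stricter analysis would replace the all-periods rule with a paired statistical test across periods and random seeds.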
Original abstract
Parameter estimation for dynamical systems remains challenging due to non-convexity and sensitivity to initial parameter guesses. Recent deep learning approaches enable accurate and fast parameter estimation but do not exploit transferable knowledge across systems. To address this, we introduce a transfer-learning-based neural parameter estimation framework based on a pretraining-fine-tuning paradigm. This approach improves accuracy and eliminates the need for an initial parameter guess. We apply this framework to building RC thermal models, evaluating it against a Genetic Algorithm and a from-scratch neural baseline across eight simulated buildings, one real-world building, two RC model configurations, and four training data lengths. Results demonstrate an 18.6-24.0% performance improvement with only 12 days of training data and up to 49.4% with 72 days. Beyond buildings, the proposed method represents a new paradigm for parameter estimation in dynamical systems.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a transfer-learning framework for neural parameter estimation in building RC thermal models. A neural network is pretrained on simulated RC models and then fine-tuned on limited data (12-72 days) from either simulated or real buildings to estimate parameters without requiring an initial guess. It is evaluated against a genetic algorithm and a from-scratch neural baseline on eight simulated buildings plus one real building, using two RC topologies and four data lengths, with reported performance gains of 18.6-24.0% for short data and up to 49.4% for longer data.
Significance. If the transfer reliably improves accuracy and generalizes, the approach could reduce data and compute requirements for parameter estimation in building energy models and similar non-convex dynamical-system identification tasks. The empirical gains on limited data are potentially useful, but the single real-building case and absence of domain-shift diagnostics limit the strength of the broader 'new paradigm' claim.
major comments (3)
- [Section 4] Section 4 (Experiments and Results): Evaluation uses only one real-world building and two RC topologies. The central transfer-learning claim requires evidence that pretrained features are invariant to domain shift; with this narrow test set the reported 18.6-49.4 % gains could be explained by optimizer robustness rather than transferable representations. An explicit domain-adaptation metric or feature-alignment check is needed.
- [Section 3] Section 3 (Proposed Method): No equations or diagrams detail the neural architecture, pretraining loss, fine-tuning procedure, or how the encoder is frozen/unfrozen. Without these, the 18.6-24.0 % improvement cannot be reproduced or isolated from hyper-parameter choices.
- [Section 4] Section 4: Results are presented without error bars, statistical significance tests, or validation-split details. The performance numbers in the abstract therefore cannot be assessed for robustness across random seeds or data partitions.
minor comments (2)
- [Abstract] Abstract: The phrase 'performance improvement' is undefined; specify the exact metric (e.g., RMSE on parameter estimates or prediction error) and the baseline value.
- [Introduction] The manuscript would benefit from a short related-work paragraph contrasting the proposed pretrain-fine-tune scheme with existing transfer-learning methods for system identification.
Simulated Author's Rebuttal
We thank the referee for the constructive comments, which help clarify the scope and strengthen the presentation of our transfer-learning framework for neural parameter estimation. We address each major comment below with specific revisions.
- Referee: [Section 4] Section 4 (Experiments and Results): Evaluation uses only one real-world building and two RC topologies. The central transfer-learning claim requires evidence that pretrained features are invariant to domain shift; with this narrow test set the reported 18.6-49.4 % gains could be explained by optimizer robustness rather than transferable representations. An explicit domain-adaptation metric or feature-alignment check is needed.
  Authors: We acknowledge the narrow real-world scope limits broader claims. The eight simulated buildings vary in thermal parameters and the two RC topologies (2R2C, 3R3C) test transfer under controlled shifts, but a single real building is insufficient to fully rule out optimizer effects. In revision we will add t-SNE visualizations of encoder latent features on simulated pretraining data versus the real building, plus a quantitative metric (Wasserstein distance between feature distributions pre- and post-fine-tuning). We will also moderate the abstract and conclusion language regarding a 'new paradigm' to better reflect the current evidence.
  Revision: partial
- Referee: [Section 3] Section 3 (Proposed Method): No equations or diagrams detail the neural architecture, pretraining loss, fine-tuning procedure, or how the encoder is frozen/unfrozen. Without these, the 18.6-24.0 % improvement cannot be reproduced or isolated from hyper-parameter choices.
  Authors: We apologize for the omission. The revised Section 3 will contain: (1) a diagram of the encoder-decoder architecture (LSTM-based encoder with fully-connected parameter head); (2) the pretraining loss as mean-squared error on ground-truth RC parameters from simulated trajectories; (3) the fine-tuning loss and optimizer schedule; and (4) the explicit procedure that the encoder remains frozen for the first 10 epochs then is unfrozen for joint training. These additions will enable exact reproduction and attribution of gains to transfer.
  Revision: yes
- Referee: [Section 4] Section 4: Results are presented without error bars, statistical significance tests, or validation-split details. The performance numbers in the abstract therefore cannot be assessed for robustness across random seeds or data partitions.
  Authors: We agree these statistics are required. The revised results will report mean and standard deviation over five independent runs (different random seeds for pretraining, fine-tuning, and data order). We will add Wilcoxon signed-rank tests with p-values against both baselines. Data splits will be detailed: for each length (12–72 days) the first 70 % is training, next 15 % validation, final 15 % test, preserving temporal order to prevent leakage.
  Revision: yes
- The manuscript contains only one real-world building; additional real-building datasets cannot be acquired for this revision.
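The chronological split described in the authors' response (first 70 % training, next 15 % validation, final 15 % test, in temporal order) can be sketched as follows. This is a minimal illustration rather than the authors' code; the function name and the rounding rule at the boundaries are assumptions.

```python
def chronological_split(samples, train_frac=0.70, val_frac=0.15):
    """Split a time-ordered sequence into train/validation/test parts
    without shuffling, so later samples never leak into earlier sets.

    Boundary indices are rounded to the nearest integer (an assumption);
    the test set receives whatever remains after train and validation.
    """
    n = len(samples)
    n_train = round(n * train_frac)
    n_val = round(n * val_frac)
    train = samples[:n_train]
    val = samples[n_train:n_train + n_val]
    test = samples[n_train + n_val:]
    return train, val, test


# Hypothetical example: 12 days of hourly samples, indexed 0..287.
hourly = list(range(12 * 24))
train, val, test = chronological_split(hourly)
```

Because the slices are contiguous and ordered, every training index precedes every validation index, which in turn precedes every test index.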
Circularity Check
No significant circularity; empirical results independent of inputs
The manuscript introduces a transfer-learning neural framework for RC model parameter estimation and reports empirical gains (18.6–49.4 %) versus a genetic algorithm and a from-scratch baseline on eight simulated buildings plus one real building. No equations, derivations, or self-referential fitting steps are described that would reduce any claimed prediction to its own training inputs by construction. Performance numbers arise from direct experimental comparison rather than from an ansatz, uniqueness theorem, or parameter fit that is then re-labeled as a prediction. Self-citations, if present, are not load-bearing for the central empirical claim.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: neural networks can be pretrained on simulated dynamical systems to extract transferable features for parameter estimation.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Passage: "transfer-learning-based neural parameter estimation framework based on a pretraining-fine-tuning paradigm... applied to building RC thermal models"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.