Real-world and simulated thermal data from 960 residential multi-zone buildings in Central Europe
Pith reviewed 2026-06-28 12:14 UTC · model grok-4.3
The pith
The ThermBuild dataset supplies real measurements from two homes and simulations from 958 buildings to support thermal dynamics modeling for heat pumps.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The ThermBuild dataset comprises real-world measurements from two single-family homes and simulations of 958 TRNSYS building models covering diverse combinations of air-source heat pump systems, numbers of thermal zones, occupancy profiles, building ages, thermal masses, sizes, orientations, window glazings, five European climates, and ventilation configurations. The dataset contains 15-minute-resolution operational data spanning 15 months for the real-world buildings and 3 years for the simulated buildings, with each building time series including detailed measurements of heat pump operation, the heating distribution system, the domestic hot water system, weather conditions, and zone-level indoor climate variables.
What carries the argument
The ThermBuild dataset of real and TRNSYS-simulated multi-zone residential building time series with heat pump and climate variables at 15-minute resolution.
If this is right
- The data supports development of energy-efficient control strategies for residential heat pump systems.
- It enables fault detection and diagnosis algorithms that operate on zone-level and system-level measurements.
- The mix of real and simulated records facilitates transfer learning and simulation-to-reality model adaptation.
- Researchers can use the collection for benchmarking and generalization testing across climates and building configurations.
- The resource promotes reproducible experiments in thermal modeling of multi-zone buildings.
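The transfer-learning use case in the list above can be illustrated with a minimal sketch. It is not the authors' method: it assumes a hypothetical first-order zone model, dT = alpha * (T_out - T) + beta * Q, fits it on plentiful synthetic "source" data, and then refits on a few "target" samples while shrinking the parameters towards the source fit. The function names, the tuple layout, and the penalty weight `lam` are all assumptions made for illustration.

```python
import random

def fit_thermal(rows, prior=(0.0, 0.0), lam=0.0):
    """Least-squares fit of dT = alpha*(T_out - T) + beta*Q.

    rows: iterable of (T, T_out, Q, T_next) tuples.
    With lam > 0 the estimate is shrunk towards `prior` (ridge-to-prior),
    which is one simple form of parameter-based transfer.
    """
    s11 = s12 = s22 = r1 = r2 = 0.0
    for T, T_out, Q, T_next in rows:
        x1, x2, y = T_out - T, Q, T_next - T
        s11 += x1 * x1
        s12 += x1 * x2
        s22 += x2 * x2
        r1 += x1 * y
        r2 += x2 * y
    # Normal equations (X'X + lam*I) theta = X'y + lam*prior,
    # solved for the 2x2 case with Cramer's rule.
    a11, a22 = s11 + lam, s22 + lam
    b1, b2 = r1 + lam * prior[0], r2 + lam * prior[1]
    det = a11 * a22 - s12 * s12
    alpha = (b1 * a22 - s12 * b2) / det
    beta = (a11 * b2 - s12 * b1) / det
    return alpha, beta

def simulate(alpha, beta, n, noise, rng):
    """Generate synthetic rows from the assumed first-order model."""
    rows, T = [], 20.0
    for _ in range(n):
        T_out = rng.uniform(-5.0, 10.0)   # outdoor temperature, degC
        Q = rng.uniform(0.0, 5.0)         # heating input, arbitrary units
        T_next = T + alpha * (T_out - T) + beta * Q + rng.gauss(0.0, noise)
        rows.append((T, T_out, Q, T_next))
        T = T_next
    return rows

rng = random.Random(0)
source = simulate(0.020, 0.10, 5000, 0.05, rng)  # stands in for simulated buildings
target = simulate(0.025, 0.12, 24, 0.05, rng)    # stands in for a data-poor real home

theta_source = fit_thermal(source)
theta_scratch = fit_thermal(target)
theta_transfer = fit_thermal(target, prior=theta_source, lam=10.0)
print("source fit:   ", theta_source)
print("target only:  ", theta_scratch)
print("with transfer:", theta_transfer)
```

The penalty weight trades off trust in the source parameters against the target data; in practice it would be chosen on held-out target samples, and the features would usually be scaled before a single `lam` is applied to both parameters.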
Where Pith is reading between the lines
- Models trained on this dataset could be tested for robustness by applying them to buildings in climates outside the five European ones included.
- The real-simulated pairing offers a natural testbed for techniques that quantify and reduce the gap between simulation and field performance in energy systems.
- Patterns extracted from the 960-building collection might identify which building parameters most influence heat pump efficiency, guiding targeted retrofits.
- Extending the dataset with newer construction standards or different heating technologies would allow direct comparison of modeling accuracy across eras.
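One way to probe generalization across climates, as suggested above, is a leave-one-climate-out split. The sketch below assumes hypothetical metadata records with `id` and `climate` fields; the actual ThermBuild metadata layout and climate labels are not described here and may differ.

```python
from collections import defaultdict

def leave_one_group_out(buildings, key):
    """Yield (held_out_value, train_ids, test_ids) for each distinct value of `key`."""
    groups = defaultdict(list)
    for b in buildings:
        groups[b[key]].append(b["id"])
    for held_out in sorted(groups):
        test_ids = list(groups[held_out])
        train_ids = [i for g, ids in groups.items() if g != held_out for i in ids]
        yield held_out, train_ids, test_ids

# Hypothetical records; identifiers and climate labels are placeholders.
buildings = [
    {"id": "b001", "climate": "climate_A"},
    {"id": "b002", "climate": "climate_A"},
    {"id": "b003", "climate": "climate_B"},
    {"id": "b004", "climate": "climate_C"},
    {"id": "b005", "climate": "climate_C"},
]

for climate, train_ids, test_ids in leave_one_group_out(buildings, "climate"):
    print(f"hold out {climate}: train on {len(train_ids)}, test on {len(test_ids)}")
```

The same function can group by any other metadata field, for example building age class or ventilation configuration, to test generalization along that axis instead.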
Load-bearing premise
The 958 TRNSYS simulations accurately capture real thermal dynamics and heat pump behavior across the described variations in building age, thermal mass, climate, and ventilation.
What would settle it
Direct comparison of key thermal response metrics such as zone temperature trajectories or heat pump power draw under identical weather inputs between the simulated models and additional real measurements from comparable unmodeled buildings.
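Such a comparison could be quantified with standard error metrics. The sketch below computes RMSE, normalized mean bias error (NMBE), and CV(RMSE) for two aligned series. It is an illustration under stated assumptions, not a procedure from the paper: bias is defined here as simulated minus measured, the plain sample count is used as the divisor (guideline variants apply a degrees-of-freedom correction), and the sample values are made up.

```python
import math

def fidelity_metrics(measured, simulated):
    """Return RMSE, NMBE (%) and CV(RMSE) (%) for two equally long series."""
    if len(measured) != len(simulated) or not measured:
        raise ValueError("series must be non-empty and of equal length")
    n = len(measured)
    errors = [s - m for m, s in zip(measured, simulated)]  # simulated minus measured
    mean_measured = sum(measured) / n
    if mean_measured == 0:
        raise ValueError("mean of measured series is zero; normalization undefined")
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mbe = sum(errors) / n
    return {
        "rmse": rmse,
        "nmbe_pct": 100.0 * mbe / mean_measured,
        "cvrmse_pct": 100.0 * rmse / mean_measured,
    }

# Made-up heat pump power values in kW, for illustration only.
measured_kw = [1.8, 2.1, 2.4, 2.0]
simulated_kw = [1.9, 2.0, 2.6, 2.1]
print(fidelity_metrics(measured_kw, simulated_kw))
```

Normalizing by the measured mean suits quantities such as power draw; for zone temperatures in degrees Celsius the unnormalized RMSE is usually the more meaningful number.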
Original abstract
This paper presents the ThermBuild dataset, which comprises real-world measurements from two single-family homes and simulations of 958 TRNSYS building models. The buildings cover diverse combinations of air-source heat pump systems, numbers of thermal zones, occupancy profiles, building ages, thermal masses, sizes, orientations, window glazings, five European climates, and ventilation configurations. The dataset contains 15-minute-resolution operational data spanning 15 months for the real-world buildings and 3 years for the simulated buildings. Each building time series includes detailed measurements of heat pump operation, the heating distribution system, the domestic hot water system, weather conditions, and zone-level indoor climate variables. The ThermBuild dataset is designed for data-driven thermal dynamics modeling, thereby supporting the deployment of energy-efficient control, as well as fault detection and diagnosis in buildings. It is particularly suited for transfer learning, generalization modeling, benchmarking, simulation-to-reality transfer, and reproducible thermal modeling research.
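To give a sense of scale, the stated resolution and durations translate into rough sample counts per building. The arithmetic below is a back-of-the-envelope sketch that assumes 365-day years, gap-free series, and no leap days; the actual record counts in the dataset may differ.

```python
SAMPLES_PER_DAY = 24 * 60 // 15  # 96 samples per day at 15-minute resolution

def expected_samples(months):
    """Approximate sample count for a gap-free series, assuming 365-day years."""
    return months * 365 * SAMPLES_PER_DAY // 12

per_simulated = expected_samples(36)   # 3 years  -> 105120
per_real = expected_samples(15)        # 15 months -> 43800
total_simulated = 958 * per_simulated  # -> 100704960 rows across all simulations

print(per_simulated, per_real, total_simulated)
```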
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents the ThermBuild dataset, comprising 15-minute resolution operational data from two real single-family homes (15 months) and 958 TRNSYS-simulated multi-zone residential buildings (3 years). The buildings span diverse air-source heat pump systems, numbers of thermal zones, occupancy profiles, building ages, thermal masses, sizes, orientations, window glazings, five European climates, and ventilation configurations. Each time series includes heat pump operation, heating distribution system, domestic hot water system, weather conditions, and zone-level indoor climate variables. The dataset is positioned to support data-driven thermal dynamics modeling for energy-efficient control, fault detection and diagnosis, transfer learning, generalization, benchmarking, and simulation-to-reality transfer.
Significance. If the TRNSYS simulations faithfully reproduce real thermal dynamics and heat pump behavior, the scale and diversity of the dataset would provide a valuable resource for training robust data-driven models in building energy management, enabling better generalization across Central European residential stock and supporting reproducible research on sim-to-reality transfer. The inclusion of both real and simulated data is a strength for benchmarking purposes.
major comments (2)
- [Abstract] Abstract: The central claim that the dataset supports 'simulation-to-reality transfer' and 'generalization modeling' requires that the 958 TRNSYS models accurately capture real thermal dynamics across the stated parameter ranges; however, no validation against the two measured homes (e.g., quantitative comparison of heat pump COP, zone temperatures, or energy consumption for comparable configurations) is described.
- [Abstract] Dataset description (implied by abstract claims): With ground truth limited to two single-family homes, the absence of error characterization, fidelity metrics, or sensitivity analysis for the simulated buildings undermines the utility for fault detection and data-driven control across varied ages, thermal masses, climates, and ventilation configurations.
minor comments (1)
- [Title] Title states '960 residential multi-zone buildings' while the abstract describes two real homes plus 958 simulated models; the manuscript should explicitly state the total count and whether the real homes are included in the 960.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on the ThermBuild dataset manuscript. We address each major comment below.
- Referee: [Abstract] Abstract: The central claim that the dataset supports 'simulation-to-reality transfer' and 'generalization modeling' requires that the 958 TRNSYS models accurately capture real thermal dynamics across the stated parameter ranges; however, no validation against the two measured homes (e.g., quantitative comparison of heat pump COP, zone temperatures, or energy consumption for comparable configurations) is described.
  Authors: We agree that the manuscript does not include quantitative validation of the TRNSYS models against the two real homes. The paper presents a dataset resource containing both real and simulated data to enable community research on simulation-to-reality transfer and generalization; it does not claim to have performed such validation itself. We will revise the abstract to clarify the dataset's intended use cases without overstating demonstrated fidelity, and add a limitations discussion on the parameterization approach.
  Revision: yes
- Referee: [Abstract] Dataset description (implied by abstract claims): With ground truth limited to two single-family homes, the absence of error characterization, fidelity metrics, or sensitivity analysis for the simulated buildings undermines the utility for fault detection and data-driven control across varied ages, thermal masses, climates, and ventilation configurations.
  Authors: The limited real-world ground truth (two homes) is a factual constraint of the dataset. The 958 simulations were generated using TRNSYS models parameterized from Central European building standards and literature values for the listed diversity factors, but the manuscript does not provide explicit error metrics or sensitivity results. We will incorporate a new subsection on simulation assumptions, available fidelity indicators from the TRNSYS setup, and acknowledged uncertainties to better support the claimed use cases.
  Revision: yes
Circularity Check
No circularity: dataset paper with no derivations or fitted predictions
The paper presents a dataset of real measurements from two homes plus TRNSYS simulations for 958 buildings. No equations, parameter fits, predictions, or first-principles derivations are claimed. The central contribution is data release for downstream modeling; the TRNSYS models are described as input generators rather than outputs derived from the paper's own results. No self-citation chains, ansatzes, or renamings reduce any claim to its own inputs. This matches the default expectation of no circularity for a non-derivational paper.