arXiv: 2606.22911 · v1 · pith:VQRKRYRInew · submitted 2026-06-22 · 💻 cs.AI · cs.LG · cs.SY · eess.SY

ThermoLLM: Thermodynamics-Aware HVAC Control with Spatial-Semantic Knowledge Graph

Pith reviewed 2026-06-26 08:35 UTC · model grok-4.3

classification 💻 cs.AI · cs.LG · cs.SY · eess.SY
keywords HVAC control · large language models · knowledge graphs · building energy management · thermal comfort · multi-zone systems · spatial semantics · EnergyPlus simulation

The pith

A spatial knowledge graph lets an LLM HVAC controller reason about zone couplings to achieve the best energy-comfort trade-off.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper develops a control method for multi-zone heating and cooling systems that combines a large language model with a structured representation of the building. The representation encodes physical adjacencies, thermal connections between zones, and recent history of temperature and control actions. At each step the model receives the current sensor readings, the graph of spatial relationships, and the short-term trajectory, allowing it to select actions that account for how heat moves across the building over time. In a five-zone EnergyPlus simulation the resulting controller records the strongest overall balance between energy consumption and occupant comfort while producing the fewest violations of the predicted mean vote comfort index. If the reported gains hold, the method offers a way to improve building control without requiring hand-crafted dynamic models of every thermal interaction.

Core claim

The central claim is that supplying an LLM with a physics-informed spatial knowledge graph derived from Brick-style building semantics and linked to recent interaction history enables it to generate HVAC control actions that respect zone layout, adjacency, and delayed thermal interactions, yielding the best energy-comfort trade-off and the lowest PMV violation among tested baselines in a five-zone simulation.

What carries the argument

A physics-informed spatial knowledge graph that encodes Brick-style building semantics and recent interaction history to supply structured context to the LLM at each control step.
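
As a rough illustration of what "structured context" could look like, the sketch below builds a tiny adjacency graph for a five-zone layout and serializes it into text triples for a prompt. The zone names, the core-plus-four-perimeter layout, the `adjacentTo` predicate, and the triple format are all assumptions made for this example; the page does not say how the paper builds or serializes its graph, and `adjacentTo` is not claimed to be a Brick relationship.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class ZoneGraph:
    """Tiny stand-in for a spatial knowledge graph (hypothetical schema)."""
    # zone name -> set of adjacent zone names (undirected)
    adjacency: Dict[str, Set[str]] = field(default_factory=dict)

    def add_adjacency(self, a: str, b: str) -> None:
        # Store the edge in both directions so lookups are symmetric.
        self.adjacency.setdefault(a, set()).add(b)
        self.adjacency.setdefault(b, set()).add(a)

    def to_triples(self) -> List[Tuple[str, str, str]]:
        # Emit each undirected edge once, in sorted order for stable prompts.
        triples = []
        for zone in sorted(self.adjacency):
            for other in sorted(self.adjacency[zone]):
                if zone < other:
                    triples.append((zone, "adjacentTo", other))
        return triples

    def serialize(self) -> str:
        # One "subject predicate object" triple per line.
        return "\n".join(f"{s} {p} {o}" for s, p, o in self.to_triples())


def build_five_zone_graph() -> ZoneGraph:
    # Assumed layout: four perimeter zones around one core zone.
    g = ZoneGraph()
    perimeter = ["north", "east", "south", "west"]
    for zone in perimeter:
        g.add_adjacency("core", zone)
    # Neighbouring perimeter zones are assumed to share a wall at each corner.
    for a, b in zip(perimeter, perimeter[1:] + perimeter[:1]):
        g.add_adjacency(a, b)
    return g
```

Calling `build_five_zone_graph().serialize()` yields eight lines, one per shared wall, which could then be pasted into a prompt section. A real Brick-derived graph would carry far richer semantics (equipment, sensors, feeds relationships) than this adjacency-only toy.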

If this is right

  • The controller records the best overall energy-comfort trade-off.
  • It produces the lowest PMV violation count while remaining energy efficient.
  • It outperforms both conventional control baselines and other LLM-based methods that lack the spatial graph.
  • Decisions at each step incorporate zone adjacency and short-term thermal evolution.
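
One way to make "best energy-comfort trade-off" concrete is a Pareto-dominance check over two metrics that should both be minimized. The sketch below is a generic illustration, not the paper's evaluation code; the `Result` fields and the demo numbers are invented for the example.

```python
from typing import List, NamedTuple


class Result(NamedTuple):
    name: str
    energy_kwh: float     # lower is better
    pmv_violations: int   # lower is better


def dominates(a: Result, b: Result) -> bool:
    """a dominates b if it is no worse on both metrics and strictly better on one."""
    no_worse = a.energy_kwh <= b.energy_kwh and a.pmv_violations <= b.pmv_violations
    better = a.energy_kwh < b.energy_kwh or a.pmv_violations < b.pmv_violations
    return no_worse and better


def pareto_front(results: List[Result]) -> List[Result]:
    # Keep every result that no other result dominates.
    return [r for r in results if not any(dominates(o, r) for o in results)]


# Made-up numbers for illustration only; not taken from the paper.
demo = [
    Result("controller_a", 100.0, 12),
    Result("controller_b", 90.0, 20),
    Result("controller_c", 110.0, 15),  # dominated by controller_a
]
```

Here `pareto_front(demo)` keeps `controller_a` and `controller_b`, since neither beats the other on both metrics. A claim of "best trade-off" is strongest when the proposed controller dominates the baselines outright rather than merely sitting on the front.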

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same graph-augmented prompting pattern could be tested on buildings with different zone counts or real sensor streams to check whether the performance edge persists outside the simulated five-zone case.
  • If the LLM learns thermal couplings from the graph, the method might reduce reliance on explicit differential-equation models of building dynamics.
  • The approach could be combined with reinforcement-learning fine-tuning of the LLM to further improve long-horizon consistency across seasons.

Load-bearing premise

The LLM can reliably translate graph-structured spatial context and short-term history into control actions that reflect actual thermal coupling and delayed interactions better than simpler prompt baselines.

What would settle it

A head-to-head run of the five-zone EnergyPlus simulation in which the identical LLM receives either the full graph-augmented prompt or an unstructured text prompt and the resulting energy use and PMV violation counts are compared.
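
A minimal sketch of how such a head-to-head comparison might be set up follows. It builds two prompts that differ only in the spatial section and counts comfort violations from a PMV trace. The section tags, the function names, and the ±0.5 PMV limit are assumptions for illustration; the page does not state the threshold or prompt layout the paper uses.

```python
from typing import Dict, Iterable, Sequence


def build_prompt(state: Dict[str, float], history: Sequence[str], spatial_context: str) -> str:
    """Assemble a prompt; only `spatial_context` differs between ablation arms."""
    state_lines = [f"{zone}: {temp:.1f} C" for zone, temp in sorted(state.items())]
    return "\n".join(
        ["[STATE]", *state_lines, "[SPATIAL]", spatial_context, "[HISTORY]", *history]
    )


def ablation_prompts(
    state: Dict[str, float],
    history: Sequence[str],
    graph_text: str,
    zone_names: Iterable[str],
) -> Dict[str, str]:
    # The "flat" arm lists the zones but drops every relationship between them.
    flat_text = "Zones: " + ", ".join(sorted(zone_names))
    return {
        "graph": build_prompt(state, history, graph_text),
        "flat": build_prompt(state, history, flat_text),
    }


def count_pmv_violations(pmv_values: Sequence[float], limit: float = 0.5) -> int:
    # Count steps whose PMV falls outside [-limit, +limit] (assumed comfort band).
    return sum(1 for v in pmv_values if abs(v) > limit)
```

Running the same base LLM on both arms over identical weather and occupancy schedules, then comparing energy use and `count_pmv_violations` per arm, would isolate the contribution of the graph structure from that of the history and the prompt format.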

Figures

Figures reproduced from arXiv: 2606.22911 by Flora D. Salim, Kirtan Bhatt, Matthew Amos, Wen Hu, Xiachong Lin.

Figure 1: ThermoLLM: Spatial-Semantic LLM-based HVAC control framework
Figure 2: Sinergym: EnergyPlus-based simulation environ
Figure 3: Prompt template for our framework
    … preservation, while for unoccupied periods, we set ρ_i = 0.1 to allow energy-saving setbacks. Since the original control problem has continuous observations and continuous heating/cooling setpoint actions, the Q-learning baseline discretizes both the observation and action spaces before training. PPO is trained under the same environment, reward definition, and seasonal sp…
Figure 4: Pareto front trade-off visualization
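
The excerpt captured with the Figure 3 caption says the Q-learning baseline discretizes continuous observations and setpoint actions before training. A generic way to do this is uniform binning, sketched below with the standard library only. The bin ranges, bin counts, and the rule that the heating setpoint must stay below the cooling setpoint are assumptions for illustration, not the paper's settings.

```python
from bisect import bisect_right
from typing import List, Sequence, Tuple


def make_edges(low: float, high: float, n_bins: int) -> List[float]:
    # Interior cut points that split [low, high] into n_bins equal-width bins.
    step = (high - low) / n_bins
    return [low + step * i for i in range(1, n_bins)]


def discretize(value: float, edges: Sequence[float]) -> int:
    # Index of the bin containing `value`; out-of-range values clamp to the end bins.
    return bisect_right(edges, value)


def action_grid(heat_levels: Sequence[float], cool_levels: Sequence[float]) -> List[Tuple[float, float]]:
    # Enumerate (heating, cooling) setpoint pairs, keeping heating below cooling.
    return [(h, c) for h in heat_levels for c in cool_levels if h < c]
```

With `make_edges(15.0, 30.0, 5)` a zone temperature of 22.0 °C lands in bin index 2, and a Q-table can then be indexed by the tuple of per-zone bin indices and the position of the chosen pair in `action_grid`. The table size grows exponentially with the number of zones, which is one reason tabular baselines struggle in multi-zone settings.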
Original abstract

Multi-zone HVAC control is a spatial decision problem in which indoor thermal evolution and control decisions depend not only on outdoor conditions and internal heat gains but also on zone layout, physical adjacency, and delayed thermal interactions across the building. Recent LLM-based HVAC controllers have shown that prompt-based control is feasible. However, these methods typically rely on task descriptions, observation values, short textual feedback, or unstructured retrieval, which limits their ability to reason about zone coupling, thermal response, and building dynamics. This paper presents a thermodynamics-aware LLM control framework for a five-zone EnergyPlus building simulation. The controller is grounded in a physics-informed spatial knowledge graph derived from Brick-style building semantics and linked with recent interaction history. At each control step, the model receives the current building state, graph-structured spatial context, and recent environment-controller history, enabling it to make decisions that reflect both building structure and short-term thermal evolution. We evaluate the framework against standard control baselines and several LLM-based alternatives. Results show that the proposed approach achieves the best overall energy-comfort trade-off and the lowest PMV violation while maintaining energy-efficient operation.
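
The per-step loop described in the abstract (current state, graph context, recent history in; control action out) can be sketched as a small controller class with a rolling history buffer. Everything here is hypothetical scaffolding: the `llm` callable stands in for whatever model API is used, the reply format `<heating>,<cooling>` is invented for the example, and a single building-wide setpoint pair is assumed for simplicity.

```python
from collections import deque
from typing import Callable, Deque, Dict, Tuple

Action = Tuple[float, float]  # (heating setpoint, cooling setpoint), assumed form


class PromptController:
    def __init__(self, graph_text: str, llm: Callable[[str], str], history_len: int = 3):
        self.graph_text = graph_text
        self.llm = llm
        # deque(maxlen=...) silently drops the oldest entry once the buffer is full.
        self.history: Deque[str] = deque(maxlen=history_len)

    def step(self, state: Dict[str, float]) -> Action:
        state_text = ", ".join(f"{z}={t:.1f}" for z, t in sorted(state.items()))
        prompt = "\n".join([
            "STATE: " + state_text,
            "GRAPH:\n" + self.graph_text,
            "HISTORY:\n" + "\n".join(self.history),
            "Reply as: <heating>,<cooling>",
        ])
        # Parse the model reply; a production controller would validate and clamp it.
        heat, cool = (float(x) for x in self.llm(prompt).split(","))
        self.history.append(f"state[{state_text}] -> action[{heat:.1f},{cool:.1f}]")
        return heat, cool
```

For a dry run, a stub such as `PromptController("core adjacentTo north", lambda prompt: "20,25")` returns `(20.0, 25.0)` on every step while the history buffer fills up to `history_len` entries. Keeping the history short bounds prompt length, at the cost of hiding thermal lags longer than the window.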

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper introduces ThermoLLM, an LLM-based HVAC controller for a five-zone EnergyPlus building that augments the model with a physics-informed spatial knowledge graph derived from Brick-style building semantics and linked to recent controller-environment history. At each step the LLM receives current state, graph-structured spatial context, and short-term history to produce control actions that purportedly reflect zone adjacency and delayed thermal coupling. The central claim is that this yields the best energy-comfort trade-off and the lowest PMV violation among standard baselines and other LLM variants while remaining energy-efficient.

Significance. If the performance advantage is shown to stem specifically from the structured spatial graph rather than from history or prompt engineering, the work would provide concrete evidence that explicit modeling of building topology improves LLM reasoning about thermal dynamics. The use of Brick semantics is a constructive step toward reproducible, physics-grounded representations in building control.

major comments (3)
  1. [Evaluation] Evaluation section: the headline result (best energy-comfort trade-off and lowest PMV violation) is attributed to the inclusion of the Brick-derived spatial knowledge graph, yet no ablation is reported that removes or replaces the graph with an unstructured zone list while holding history, prompt format, and base LLM fixed. This isolation is required to substantiate the claim that graph-structured spatial context is the load-bearing factor.
  2. [Abstract] Abstract and Methods: the abstract asserts superior performance on energy-comfort and PMV metrics but supplies no numerical values, baseline definitions, statistical tests, or ablation tables; without these the central empirical claim cannot be assessed.
  3. [§3] §3 (or equivalent methods description): the construction of the physics-informed spatial knowledge graph and the precise mechanism by which it is serialized into the LLM prompt are not detailed enough to allow reproduction or to verify that the graph actually encodes adjacency and delayed thermal interactions beyond what unstructured text would provide.
minor comments (2)
  1. Clarify the exact prompt template and how graph nodes/edges are represented in text so that readers can judge the degree of structure actually supplied to the LLM.
  2. Provide the full set of baseline controllers with identical hyper-parameter reporting as the proposed method.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments that identify key areas for strengthening the empirical claims and reproducibility of the manuscript. We address each major point below and will incorporate revisions to improve the paper.

Point-by-point responses
  1. Referee: [Evaluation] Evaluation section: the headline result (best energy-comfort trade-off and lowest PMV violation) is attributed to the inclusion of the Brick-derived spatial knowledge graph, yet no ablation is reported that removes or replaces the graph with an unstructured zone list while holding history, prompt format, and base LLM fixed. This isolation is required to substantiate the claim that graph-structured spatial context is the load-bearing factor.

    Authors: We agree that the current manuscript lacks this specific ablation. To substantiate the claim that the graph structure is the key factor, we will add an ablation study in the revised evaluation section that replaces the spatial knowledge graph with an unstructured zone list while holding history, prompt format, and base LLM fixed.
    Revision: yes

  2. Referee: [Abstract] Abstract and Methods: the abstract asserts superior performance on energy-comfort and PMV metrics but supplies no numerical values, baseline definitions, statistical tests, or ablation tables; without these the central empirical claim cannot be assessed.

    Authors: We will revise the abstract to include specific numerical values for the energy-comfort trade-off and PMV metrics, explicitly define the baselines, and reference the statistical tests and ablation tables to allow proper assessment of the claims.
    Revision: yes

  3. Referee: [§3] §3 (or equivalent methods description): the construction of the physics-informed spatial knowledge graph and the precise mechanism by which it is serialized into the LLM prompt are not detailed enough to allow reproduction or to verify that the graph actually encodes adjacency and delayed thermal interactions beyond what unstructured text would provide.

    Authors: We will expand the methods section (or equivalent) with a detailed description of the knowledge graph construction from Brick semantics, how it encodes adjacency and delayed thermal interactions, and the precise serialization mechanism into the LLM prompt, including examples to support reproducibility.
    Revision: yes

Circularity Check

0 steps flagged

No circularity: empirical framework with no derivations or fitted predictions

Full rationale

The paper describes an LLM-based HVAC controller that ingests a Brick-derived spatial knowledge graph plus history, then evaluates it via simulation against baselines. No equations, first-principles derivations, parameter fitting, or predictions that reduce to inputs by construction appear in the provided text. The performance claim is an empirical comparison result, not a mathematical reduction. Self-citations, if present, are not load-bearing for any derivation. The derivation chain is therefore self-contained and non-circular.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

Abstract-only review; ledger entries are inferred from stated components rather than explicit paper content. The central addition is the knowledge graph whose utility depends on unstated assumptions about LLM reasoning capacity.

axioms (1)
  • Domain assumption: an LLM can effectively reason over graph-structured building semantics and short-term thermal history to produce control actions
    The framework's advantage is predicated on this capability; no evidence or mechanism is supplied in the abstract.
invented entities (1)
  • Physics-informed spatial knowledge graph (no independent evidence)
    purpose: To supply zone layout, adjacency, and interaction history to the LLM controller
    Introduced as the grounding mechanism for the controller; no independent validation or external data source is mentioned.

pith-pipeline@v0.9.1-grok · 5744 in / 1321 out tokens · 28779 ms · 2026-06-26T08:35:55.093253+00:00 · methodology

