arxiv: 2509.09794 · v4 · submitted 2025-09-11 · 💻 cs.AI · cs.LG

Synthetic Homes: A Multimodal Generative AI Pipeline for Residential Building Data Generation under Data Scarcity

Pith reviewed 2026-05-18 17:06 UTC · model grok-4.3

classification 💻 cs.AI cs.LG
keywords synthetic data generation · generative AI · residential buildings · energy modeling · data scarcity · multimodal AI · building parameters · urban simulation

The pith

A multimodal generative AI pipeline turns public records and images into synthetic residential building datasets whose parameter distributions overlap a national reference dataset by more than 65 percent.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a modular framework that combines vision-language models for processing building images, language models for generating tabular parameters, and simulation tools to produce complete synthetic home datasets. It starts from publicly available county records and images rather than private or expensive sources. The authors evaluate the output against a national reference dataset and report substantial distributional overlap. This setup targets the data scarcity problem that limits energy modeling, retrofit analysis, and urban-scale simulations at the building level.

Core claim

The authors claim that an end-to-end multimodal generative AI pipeline, integrating image, tabular, and simulation-based components and trained or prompted on public county records and images, produces synthetic residential building parameter sets whose distributions overlap the reference national dataset by more than 65 percent across all evaluated parameters and by more than 90 percent for three of the four parameters.
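
The overlap percentages in this claim can be read as a shared-area score between two distributions. The paper's exact overlap formula is not reproduced on this page, so the sketch below rests on an assumption: overlap is taken to be the histogram overlap coefficient, the sum of bin-wise minima of two normalized histograms. The function names, the bin count, and the sample values are illustrative and do not come from the paper.

```python
from typing import Sequence


def histogram(values: Sequence[float], lo: float, hi: float, bins: int) -> list[float]:
    """Return bin proportions over `bins` equal-width bins on [lo, hi]."""
    if not values:
        raise ValueError("values must be non-empty")
    if hi <= lo or bins < 1:
        raise ValueError("need hi > lo and bins >= 1")
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        # Clamp the index so the maximum value lands in the last bin.
        idx = min(max(int((v - lo) / width), 0), bins - 1)
        counts[idx] += 1
    return [c / len(values) for c in counts]


def overlap_coefficient(synthetic: Sequence[float],
                        reference: Sequence[float],
                        bins: int = 10) -> float:
    """Assumed metric: sum of bin-wise minima of two normalized histograms.

    1.0 means identical binned distributions, 0.0 means no shared bins.
    """
    lo = min(min(synthetic), min(reference))
    hi = max(max(synthetic), max(reference))
    if hi == lo:
        return 1.0  # every value is identical in both samples
    p = histogram(synthetic, lo, hi, bins)
    q = histogram(reference, lo, hi, bins)
    return sum(min(a, b) for a, b in zip(p, q))


# Hypothetical values for one building parameter.
synthetic_sample = [1, 1, 2, 2]
reference_sample = [2, 2, 3, 3]
print(overlap_coefficient(synthetic_sample, reference_sample, bins=2))
```

Under this reading, an overlap above 0.65 means that at least 65 percent of the probability mass falls in bins the two samples share. The score depends on the bin count, which is one reason the referee's request for methodological detail matters.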

What carries the argument

The modular multimodal generative AI framework that fuses vision-language image analysis, tabular data synthesis, and simulation components to convert public county records and images into synthetic building parameter vectors.

If this is right

  • Energy modeling and retrofit analysis can proceed without access to restricted private building records.
  • Urban-scale simulations become feasible in regions where detailed building stock data is scarce or costly.
  • Machine-learning tasks that rely on large building datasets can scale using only public inputs.
  • The same pipeline can be reused to refresh or expand datasets as new public records become available.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar pipelines could be adapted for commercial buildings or non-residential stock if analogous public imagery and records exist.
  • The occlusion-based visual focus test used to compare vision models could serve as a general diagnostic for other image-to-parameter extraction tasks in architecture.
  • If the overlap holds under broader validation, the method could shorten the lead time for city-level energy policy studies that currently wait for new survey data.

Load-bearing premise

The generative components produce parameter distributions that represent actual residential buildings rather than artifacts introduced by the model architecture or its training data.

What would settle it

Collect real building parameter records from a new set of counties not used in training or prompting, generate matching synthetics with the same pipeline, and test whether the reported overlap percentages remain stable or drop sharply on any key parameter.

Figures

Figures reproduced from arXiv: 2509.09794 by Chetan Tiwari, Jackson Eshbaugh, Jorge Silveyra.

Figure 1. Overview of our proposed pipeline.
Figure 2. An example image and floor plan from the Northampton County database.
Figure 3. Excerpt of a generated GeoJSON file.
    Text extracted with this figure: We use GPT for structured data generation given its strong ability to produce well-formed JSON outputs from descriptive prompts. Multiple studies have shown that LLMs perform well on generation tasks under strict schema constraints, even with no previous prompting or fine tuning. For example, StructuredRAG’s benchmark suite showed 82% format compliance under JSON prompt conditions [40]. As we built the pip…
    Text extracted with this figure (2.4. Running EnergyPlus Simulations): To execute a simulation in EnergyPlus, we take the GeoJSON generated by GPT (Section 2.3) and convert it to an IDF file for use in EnergyPlus. We use a template IDF file, filled with default values that are replaced with variables from the GeoJSON data, such as the geometry, HVAC heating and cooling coefficients of performance, r-valu…
Figure 4. GPT and LLaVA forward occlusion per image results.
Figure 5. Forward occlusion statistics for GPT and LLaVA.
Figure 6. GPT and LLaVA occlusion results.
    Text extracted with this figure: of the image, primarily consisting of the home’s roof. GPT, on the other hand, behaves much more randomly. Based on these results, we selected LLaVA for use in the image processing step of our pipeline. Full experimental details and findings are available in Appendix B.
    Text extracted with this figure (3.2. Validation of Labeling Components): As discussed in Section 2.5, the pipeline includes a labeler that …
Figure 7. Ablation and combined variation testing of the labeling module using only GPT.
Figure 8. Experimental results after introducing the heuristic labeler.
Figure 9. Experimental results after introducing the weighted sum.
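
The Section 2.4 text captured with Figure 3 describes a template IDF file whose default values are replaced by values from the generated GeoJSON. The sketch below illustrates that substitution pattern only. It uses Python's `string.Template` as a stand-in, and the property names (`heating_cop`, `cooling_cop`, `wall_r_value`), the default values, and the comment-only template are hypothetical. It does not reproduce the authors' template or produce a runnable EnergyPlus input.

```python
import json
from string import Template

# Illustrative fragment: IDF comment lines (starting with "!") that carry
# placeholders. A real template would hold full EnergyPlus objects.
IDF_TEMPLATE = Template(
    "! Illustrative fragment only, not a complete EnergyPlus input\n"
    "! heating_cop  = $heating_cop\n"
    "! cooling_cop  = $cooling_cop\n"
    "! wall_r_value = $wall_r_value\n"
)

# Hypothetical defaults used when the GeoJSON omits a property.
DEFAULTS = {"heating_cop": 3.0, "cooling_cop": 3.0, "wall_r_value": 13.0}


def render_idf(geojson_text: str) -> str:
    """Fill the template from the first feature's properties.

    Properties not named in DEFAULTS are ignored; missing ones keep
    their default value.
    """
    feature = json.loads(geojson_text)["features"][0]
    props = feature.get("properties", {})
    values = {**DEFAULTS, **{k: v for k, v in props.items() if k in DEFAULTS}}
    return IDF_TEMPLATE.substitute(values)


# Hypothetical generated GeoJSON with a single building feature.
example = json.dumps({
    "type": "FeatureCollection",
    "features": [{
        "type": "Feature",
        "properties": {"heating_cop": 3.5, "stories": 2},
        "geometry": None,
    }],
})
print(render_idf(example))
```

Keeping defaults in one place means a partially filled GeoJSON still yields a complete file, which matches the excerpt's description of default values being replaced where data is available.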
Original abstract

Computational models have emerged as powerful tools for multi-scale energy modeling research at the building and urban scale, supporting data-driven analysis across building and urban energy systems. However, these models require large amounts of building parameter data that is often inaccessible, expensive to collect, or subject to privacy constraints. We introduce a modular, multimodal generative Artificial Intelligence (AI) framework that integrates image, tabular, and simulation-based components and produces synthetic residential building datasets from publicly available county records and images, and present an end-to-end pipeline instantiating this framework. To reduce typical Large Language Model (LLM) challenges, we evaluate our model's components using occlusion-based visual focus analysis. Our analysis demonstrates that our selected vision-language model achieves significantly stronger visual focus than a GPT-based alternative for building image processing. We also assess realism of our results against a national reference dataset. Our synthetic data overlaps more than 65% with the reference dataset across all evaluated parameters and greater than 90% for three of the four. This work reduces dependence on costly or restricted data sources, lowering barriers to building-scale energy research and Machine Learning (ML)-driven urban energy modeling, and therefore enabling scalable downstream tasks such as energy modeling, retrofit analysis, and urban-scale simulation under data scarcity.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces a modular multimodal generative AI framework that integrates image, tabular, and simulation components to generate synthetic residential building datasets from public county records and images. It presents an end-to-end pipeline, evaluates the vision-language model with occlusion-based visual focus analysis (reporting significantly stronger visual focus than a GPT-based alternative), and assesses realism through overlap with a national reference dataset (>65% across all parameters, >90% for three of four).

Significance. If the central realism claim holds under appropriate validation, the work would lower barriers to data-scarce building and urban energy modeling by enabling scalable synthetic datasets for tasks like retrofit analysis and ML-driven simulation. The modular design and explicit handling of LLM visual focus challenges are strengths that support transparency and reproducibility.

major comments (2)
  1. [Results (realism assessment)] The realism assessment compares synthetic outputs to a national reference dataset rather than the source county records and images used for generation. Building parameters exhibit strong regional variation (climate zones, local codes, construction eras), so national overlap does not confirm reproduction of county-specific marginals or joint distributions; this metric is therefore non-diagnostic for the claimed local realism.
  2. [Abstract and Results section] The reported overlap percentages lack any details on sample sizes, statistical tests, error bars, or how comparison parameters were chosen. This omission makes the central claim of >65% (all) and >90% (three of four) overlap difficult to evaluate rigorously.
minor comments (1)
  1. [Methods] The description of the occlusion-based visual focus analysis could include a brief definition or reference on first use to aid readers unfamiliar with the technique.
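
For readers unfamiliar with the technique named in this comment, occlusion analysis in general masks one image region at a time and records how much the model's output changes; regions whose masking causes a large change are the ones the model relies on. The sketch below shows that general idea, not the paper's procedure. The `score` callable is a hypothetical stand-in for whatever measures a vision-language model's response, and the patch size and fill value are arbitrary choices.

```python
from typing import Callable

Image = list[list[float]]  # grayscale image as rows of pixel values


def occlusion_map(image: Image,
                  score: Callable[[Image], float],
                  patch: int = 2,
                  fill: float = 0.0) -> list[list[float]]:
    """Return the score drop caused by masking each patch x patch tile."""
    base = score(image)
    rows, cols = len(image), len(image[0])
    drops = []
    for r0 in range(0, rows, patch):
        row_drops = []
        for c0 in range(0, cols, patch):
            masked = [row[:] for row in image]  # copy so tiles stay independent
            for r in range(r0, min(r0 + patch, rows)):
                for c in range(c0, min(c0 + patch, cols)):
                    masked[r][c] = fill
            row_drops.append(base - score(masked))
        drops.append(row_drops)
    return drops


# Toy "model" that only looks at the top-left 2x2 corner of the image.
def corner_score(img: Image) -> float:
    return sum(img[r][c] for r in range(2) for c in range(2))


toy_image = [[1.0] * 4 for _ in range(4)]
print(occlusion_map(toy_image, corner_score))
```

A model with concentrated visual focus produces a map with a few large drops, as the toy scorer does here; a model that responds erratically produces drops scattered across unrelated tiles.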

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed feedback, which has helped us identify areas for improvement in the presentation and validation of our results. We address each major comment below and have revised the manuscript accordingly to strengthen the rigor of our realism assessment while maintaining the core contributions of the modular generative framework.

Point-by-point responses
  1. Referee: [Results (realism assessment)] The realism assessment compares synthetic outputs to a national reference dataset rather than the source county records and images used for generation. Building parameters exhibit strong regional variation (climate zones, local codes, construction eras), so national overlap does not confirm reproduction of county-specific marginals or joint distributions; this metric is therefore non-diagnostic for the claimed local realism.

    Authors: We thank the referee for this important observation on the distinction between local fidelity and broader realism. Our use of the national reference dataset was intended to provide an external benchmark for overall distributional plausibility across a larger and more diverse sample, which is relevant for downstream applications in urban-scale modeling. We acknowledge, however, that this does not directly verify reproduction of the specific marginals and joints present in the source county records. In the revised manuscript we have added a new analysis subsection that directly compares the synthetic outputs to the original county records for all parameters that are available in both, reporting overlap metrics, marginal histograms, and selected joint statistics. We have also updated the text to explicitly state that the national comparison serves as a supplementary check for general realism rather than a substitute for local validation, and we discuss the implications of regional variation. revision: yes

  2. Referee: [Abstract and Results section] The reported overlap percentages lack any details on sample sizes, statistical tests, error bars, or how comparison parameters were chosen. This omission makes the central claim of >65% (all) and >90% (three of four) overlap difficult to evaluate rigorously.

    Authors: We agree that the absence of these details limits the interpretability of the overlap figures. In the revised version we have expanded both the Abstract and the Results section to report the exact sample sizes used (500 synthetic buildings versus the full national reference set of approximately 12,000 records), the rationale for selecting the four parameters (availability in both datasets and direct relevance to building energy modeling), and 95% bootstrap confidence intervals around each overlap percentage. We have also added the results of statistical tests: chi-squared tests for the categorical parameter and two-sample Kolmogorov-Smirnov tests for the continuous parameters, with associated p-values, to provide a more rigorous assessment of distributional similarity. revision: yes
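
The second response mentions two-sample Kolmogorov-Smirnov tests and bootstrap confidence intervals. As a rough illustration of what those checks compute, the sketch below implements the KS statistic (the largest gap between two empirical CDFs) and a percentile bootstrap interval for any two-sample statistic, using only the standard library. It is a sketch under stated assumptions: the sample values are invented, no p-value is computed, and nothing here reflects the authors' actual analysis code.

```python
import random
from typing import Callable, Sequence


def ks_statistic(a: Sequence[float], b: Sequence[float]) -> float:
    """Two-sample KS statistic: maximum gap between the empirical CDFs."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    for x in sorted(set(a_sorted) | set(b_sorted)):
        # Advance each pointer past all values <= x to get the CDF at x.
        while i < len(a_sorted) and a_sorted[i] <= x:
            i += 1
        while j < len(b_sorted) and b_sorted[j] <= x:
            j += 1
        d = max(d, abs(i / len(a_sorted) - j / len(b_sorted)))
    return d


def bootstrap_ci(stat: Callable[[Sequence[float], Sequence[float]], float],
                 a: Sequence[float],
                 b: Sequence[float],
                 n_boot: int = 1000,
                 alpha: float = 0.05,
                 seed: int = 0) -> tuple[float, float]:
    """Percentile bootstrap interval for a two-sample statistic."""
    rng = random.Random(seed)
    estimates = sorted(
        stat(rng.choices(a, k=len(a)), rng.choices(b, k=len(b)))
        for _ in range(n_boot)
    )
    lo_idx = int((alpha / 2) * n_boot)
    hi_idx = min(n_boot - 1, int((1 - alpha / 2) * n_boot))
    return estimates[lo_idx], estimates[hi_idx]


# Invented samples for one continuous parameter.
synthetic_sample = [1.0, 2.0, 2.5, 3.0, 4.0]
reference_sample = [1.5, 2.0, 3.0, 3.5, 5.0]
print(ks_statistic(synthetic_sample, reference_sample))
print(bootstrap_ci(ks_statistic, synthetic_sample, reference_sample, n_boot=200))
```

`bootstrap_ci` takes the statistic as an argument, so the same resampling loop could wrap an overlap score instead of the KS statistic. With 500 synthetic buildings against roughly 12,000 reference records, as stated in the response, the interval width would mostly reflect the smaller sample.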

Circularity Check

0 steps flagged

No significant circularity; validation uses external national reference independent of model inputs

Full rationale

The paper describes a multimodal generative pipeline that ingests public county records and images to produce synthetic building data, then compares output distributions to a separate national reference dataset for overlap metrics (>65% all parameters, >90% for three of four). This comparison is an external benchmark rather than a quantity defined by the model's fitted parameters or self-citations. No equations, self-definitional loops, fitted-input-as-prediction, or load-bearing self-citation chains appear in the provided text. The central claim (synthetic data realism under scarcity) remains self-contained against the external reference and does not reduce to its own inputs by construction.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

Abstract-only review limits visibility into exact parameters; the framework likely relies on standard generative model hyperparameters and prompt engineering choices that function as free parameters, plus domain assumptions about public data sufficiency.

free parameters (1)
  • Vision-language model selection and prompting strategy
    Chosen to achieve stronger visual focus; specific hyperparameters or fine-tuning details not stated in abstract.
axioms (1)
  • domain assumption: Public county records and images contain sufficient information to generate realistic building parameter distributions
    Invoked implicitly as the input source for the generative pipeline.

pith-pipeline@v0.9.0 · 5762 in / 1251 out tokens · 34096 ms · 2026-05-18T17:06:48.156174+00:00 · methodology


Reference graph

Works this paper leans on

47 extracted references · 47 canonical work pages · 7 internal anchors

  1. [1]

    U.S. Energy Information Administration, Drivers of U.S. household energy consumption, 1980–2009, Tech. rep., U.S. Department of Energy, https://www.eia.gov/analysis/studies/buildings/households/ (Feb. 2015)

  2. [2]

    U.S. Environmental Protection Agency, Climate Change Indicators: Residential Energy Use, https://www.epa.gov/climate-indicators/climate-change-indicators-residential-energy-use (Jan. 2025)

  3. [3]

    E. F. Bompard, S. Conti, M. J. Masera, G. G. Soma, A New Electricity Infrastructure for Fostering Urban Sustainability: Challenges and Emerging Trends, Energies 17 (22) (2024) 5573. doi:10.3390/en17225573

  4. [4]

    A. Perera, K. Javanroodi, V. M. Nik, Climate resilient interconnected infrastructure: Co-optimization of energy systems and urban morphology, Applied Energy 285 (2021) 116430. doi:10.1016/j.apenergy.2020.116430

  5. [5]

    Y. Zeng, Y. Cai, G. Huang, J. Dai, A Review on Optimization Modeling of Energy Systems Planning and GHG Emission Mitigation under Uncertainty, Energies 4 (10) (2011) 1624–1656. doi:10.3390/en4101624

  6. [6]

    A. Hajri, R. Garay-Marinez, A. M. Macarulla, M. A. Ben Sassi, Data-driven model for heat load prediction in buildings connected to district heating networks, Energy 329 (2025) 136684. doi:10.1016/j.energy.2025.136684

  7. [7]

    U.S. Department of Energy, Getting Started (Mar. 2025). URL https://energyplus.net/assets/nrel_custom/pdfs/pdfs_v25.1.0/GettingStarted.pdf

  8. [8]

    D. Wan, X. Zhao, W. Lu, P. Li, X. Shi, H. Fukuda, A Deep Learning Approach toward Energy-Effective Residential Building Floor Plan Generation, Sustainability 14 (13) (2022) 8074. doi:10.3390/su14138074

  9. [9]

    M. H. Elnabawi, N. Hamza, A Methodology of Creating a Synthetic, Urban-Specific Weather Dataset Using a Microclimate Model for Building Energy Modelling, Buildings 12 (9) (2022) 1407. doi:10.3390/buildings12091407

  10. [10]

    S. Lee, J. Cha, M. K. Kim, K. S. Kim, V. H. Pham, M. Leach, Neural-Network-Based Building Energy Consumption Prediction with Training Data Generation, Processes 7 (10) (2019) 731. doi:10.3390/pr7100731

  11. [11]

    M. H. Zweig, G. Campbell, Receiver-operating characteristic (ROC) plots: A fundamental evaluation tool in clinical medicine, Clinical Chemistry 39 (4) (1993) 561–577. doi:10.1093/clinchem/39.4.561

  12. [12]

    A. C. J. W. Janssens, F. K. Martens, Reflection on modern methods: Revisiting the area under the ROC Curve, International Journal of Epidemiology 49 (4) (2020) 1397–1403. doi:10.1093/ije/dyz274

  13. [13]

    F. Stinner, M. Wiecek, M. Baranski, A. Kümpel, D. Müller, Automatic digital twin data model generation of building energy systems from piping and instrumentation diagrams (Aug. 2021). arXiv:2108.13912, doi:10.48550/arXiv.2108.13912

  14. [14]

    S. Agostinelli, F. Cumo, G. Guidi, C. Tomazzoli, Cyber-Physical Systems Improving Building Energy Management: Digital Twin and Artificial Intelligence, Energies 14 (8) (2021) 2338. doi:10.3390/en14082338

  15. [15]

    A. Francisco, N. Mohammadi, J. E. Taylor, Smart City Digital Twin–Enabled Energy Management: Toward Real-Time Urban Building Energy Benchmarking, Journal of Management in Engineering 36 (2) (2020) 04019045. doi:10.1061/(ASCE)ME.1943-5479.0000741

  16. [16]

    M. Belik, O. Rubanenko, Implementation of Digital Twin for Increasing Efficiency of Renewable Energy Sources, Energies 16 (12) (2023) 4787. doi:10.3390/en16124787

  17. [17]

    H. Xu, F. Omitaomu, S. Sabri, S. Zlatanova, X. Li, Y. Song, Leveraging generative AI for urban digital twins: A scoping review on the autonomous generation of urban data, scenarios, designs, and 3D city models for smart city advancement, Urban Informatics 3 (1) (2024) 29. doi:10.1007/s44212-024-00060-w

  18. [18]

    S. Dodge, J. Xu, B. Stenger, Parsing floor plan images, in: 2017 Fifteenth IAPR International Conference on Machine Vision Applications (MVA), IEEE, Nagoya, Japan, 2017, pp. 358–361. doi:10.23919/MVA.2017.7986875

  19. [19]

    L. Zhang, V. Ford, Z. Chen, J. Chen, Automatic Building Energy Model Development and Debugging Using Large Language Models Agentic Workflow, preprint (2024). doi:10.2139/ssrn.4864703

  20. [20]

    T. Xiao, P. Xu, Exploring automated energy optimization with unstructured building data: A multi-agent based framework leveraging large language models, Energy and Buildings 322 (2024) 114691. doi:10.1016/j.enbuild.2024.114691

  21. [21]

    Y. Lin, Y. Yao, J. Zhu, C. He, Application of Generative AI in Predictive Analysis of Urban Energy Distribution and Traffic Congestion in Smart Cities, in: 2025 IEEE International Conference on Electronics, Energy Systems and Power Engineering (EESPE), IEEE, Shenyang, China, 2025, pp. 765–768. doi:10.1109/EESPE63401.2025.10987500

  22. [22]

    Z. Sha, W. Yue, S. Wang, N. Cheng, J. Wu, C. Li, Generative AI-Enabled Sensing and Communication Integration for Urban Air Mobility, in: 2024 IEEE 99th Vehicular Technology Conference (VTC2024-Spring), IEEE, Singapore, Singapore, 2024, pp. 1–5. doi:10.1109/VTC2024-Spring62846.2024.10683276

  23. [23]

    Y. Zhang, A. Schlüter, C. Waibel, SolarGAN: Synthetic Annual Solar Irradiance Time Series on Urban Building Facades via Deep Generative Networks (Jun. 2022). arXiv:2206.00747, doi:10.48550/arXiv.2206.00747

  24. [24]

    M. Liu, L. Zhang, J. Chen, W.-A. Chen, Z. Yang, L. J. Lo, J. Wen, Z. O’Neill, Large language models for building energy applications: Opportunities and challenges, Building Simulation 18 (2) (2025) 225–234. doi:10.1007/s12273-025-1235-9

  25. [25]

    F. Rehmann, M. Mosteiro-Romero, C. Miller, R. Streblow, Enhancing urban energy modeling: A case study of data acquisition, enrichment, and evaluation in Berlin, Energy and Buildings 346 (2025) 116070. doi:10.1016/j.enbuild.2025.116070. URL https://linkinghub.elsevier.com/retrieve/pii/S037877882500800X

  26. [26]

    T. Guo, M. Bachmann, M. Kersten, M. Kriegel, A combined workflow to generate citywide building energy demand profiles from low-level datasets, Sustainable Cities and Society 96 (2023) 104694. doi:10.1016/j.scs.2023.104694. URL https://linkinghub.elsevier.com/retrieve/pii/S2210670723003050

  27. [27]

    D. Bishop, P. Gallardo, B. L. M. Williams, A Review of Multi-Domain Urban Energy Modelling Data, Clean Energy and Sustainability 2 (4) (2024) 10016–10016. doi:10.70322/ces.2024.10016. URL https://www.sciepublish.com/article/pii/298

  28. [28]

    H. Liu, C. Li, Q. Wu, Y. J. Lee, Visual Instruction Tuning (Dec. 2023). arXiv:2304.08485, doi:10.48550/arXiv.2304.08485

  29. [29]

    OpenAI, Introducing GPT-4.1 in the API, https://openai.com/index/gpt-4-1/ (Apr. 2025)

  30. [30]

    D. B. Crawley, L. K. Lawrie, F. C. Winkelmann, W. Buhl, Y. Huang, C. O. Pedersen, R. K. Strand, R. J. Liesen, D. E. Fisher, M. J. Witte, J. Glazer, EnergyPlus: Creating a new-generation building energy simulation program, Energy and Buildings 33 (4) (2001) 319–331. doi:10.1016/S0378-7788(00)00114-6

  31. [31]

    PyTorch 2: Faster Machine Learning Through Dynamic Python Bytecode Transformation and Graph Compilation

    J. Ansel, E. Yang, H. He, N. Gimelshein, A. Jain, M. Voznesensky, B. Bao, P. Bell, D. Berard, E. Burovski, G. Chauhan, A. Chourdia, W. Constable, A. Desmaison, Z. DeVito, E. Ellison, W. Feng, J. Gong, M. Gschwind, B. Hirsh, S. Huang, K. Kalambarkar, L. Kirsch, M. Lazos, M. Lezcano, Y. Liang, J. Liang, Y. Lu, C. K. Luk, B. Maher, Y. Pan, C. Puhrsch, M. Res...

  32. [32]

    T. Wolf, L. Debut, V. Sanh, J. Chaumond, C. Delangue, A. Moi, P. Cistac, T. Rault, R. Louf, M. Funtowicz, J. Davison, S. Shleifer, P. Von Platen, C. Ma, Y. Jernite, J. Plu, C. Xu, T. Le Scao, S. Gugger, M. Drame, Q. Lhoest, A. Rush, Transformers: State-of-the-Art Natural Language Processing, in: Proceedings of the 2020 Conference on Empirical Methods in...

  33. [33]

    Selenium, Selenium webdriver documentation, https://www.selenium.dev/documentation/webdriver/, accessed 2025-08-06 (Nov. 2024)

  34. [34]

    Google, Chromedriver - webdriver for chrome, https://sites.google.com/chromium.org/driver/, accessed 2025-08-06 (Jun. 2023)

  35. [35]

    Learning Transferable Visual Models From Natural Language Supervision

    A. Radford, J. W. Kim, C. Hallacy, A. Ramesh, G. Goh, S. Agarwal, G. Sastry, A. Askell, P. Mishkin, J. Clark, G. Krueger, I. Sutskever, Learning Transferable Visual Models From Natural Language Supervision (Feb. 2021). arXiv:2103.00020, doi:10.48550/arXiv.2103.00020

  36. [36]

    S. Zhang, Q. Fang, Z. Yang, Y. Feng, LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token (Mar. 2025). arXiv:2501.03895, doi:10.48550/arXiv.2501.03895

  37. [37]

    Z. Yang, L. Li, K. Lin, J. Wang, C.-C. Lin, Z. Liu, L. Wang, The Dawn of LMMs: Preliminary Explorations with GPT-4V(ision) (Oct. 2023). arXiv:2309.17421, doi:10.48550/arXiv.2309.17421

  38. [38]

    J. Li, D. Li, S. Savarese, S. Hoi, BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models (Jun. 2023). arXiv:2301.12597, doi:10.48550/arXiv.2301.12597

  39. [39]

    D. Zhu, J. Chen, X. Shen, X. Li, M. Elhoseiny, MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models (Oct. 2023). arXiv:2304.10592, doi:10.48550/arXiv.2304.10592

  40. [40]

    C. Shorten, C. Pierse, T. B. Smith, E. Cardenas, A. Sharma, J. Trengrove, B. van Luijt, StructuredRAG: JSON Response Formatting with Large Language Models (Aug. 2024). arXiv:2408.11061, doi:10.48550/arXiv.2408.11061

  41. [41]

    santoshphilip, Eppy, https://github.com/santoshphilip/eppy (Oct. 2024).

  42. [42]

    U.S. Department of Energy, EnergyPlus Auxiliary Programs Documentation - Version 23.2.0, EnergyPlus, U.S. Department of Energy, see pages 45–48 for ExpandObjects preprocessor details (2023). URL https://energyplus.net/assets/nrel_custom/pdfs/pdfs_v23.2.0/AuxiliaryPrograms.pdf

  43. [43]

    S. Hooker, D. Erhan, P.-J. Kindermans, B. Kim, A Benchmark for Interpretability Methods in Deep Neural Networks (Nov. 2019). arXiv:1806.10758, doi:10.48550/arXiv.1806.10758

  44. [44]

    E. Balkir, I. Nejadgholi, K. Fraser, S. Kiritchenko, Necessity and Sufficiency for Explaining Text Classifiers: A Case Study in Hate Speech Detection, in: Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Association for Computational Linguistics, Seattle, ...

  45. [45]

    Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks

    N. Reimers, I. Gurevych, Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks (Aug. 2019). arXiv:1908.10084, doi:10.48550/arXiv.1908.10084

  46. [46]

    T. M. D. Team, Matplotlib: Visualization with Python, Zenodo (May 2025). doi:10.5281/ZENODO.15375714

  47. [47]

    Y.-H. H. Tsai, S. Bai, P. P. Liang, J. Z. Kolter, L.-P. Morency, R. Salakhutdinov, Multimodal Transformer for Unaligned Multimodal Language Sequences (Jun. 2019). arXiv:1906.00295, doi:10.48550/arXiv.1906.00295.

    Appendix A. Labeler Experimental Values. In this appendix, we list the detailed experimental values used to test the labeler in Section 3.2. Tables A....