See the past: Time-Reversed Scene Reconstruction from Thermal Traces Using Visual Language Models

Jorge Bacca; Kebin Contreras; Luis Toscano-Palomino; Mauro Dalla Mura

arxiv: 2510.05408 · v2 · submitted 2025-10-06 · 💻 cs.CV · cs.AI

Pith reviewed 2026-05-18 09:31 UTC · model grok-4.3

keywords: thermal imaging, scene reconstruction, visual language models, diffusion models, time reversal, heat traces, forensic analysis

The pith

Thermal traces from human interactions enable reconstruction of past scenes up to 120 seconds earlier using visual language models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes a framework to reconstruct previous states of a scene by leveraging thermal images that capture residual heat from people. It combines visual language models with a diffusion process to generate consistent past images from current RGB and thermal pairs. One model describes the scene while the other directs the reconstruction to maintain semantic accuracy. This matters for applications like forensics because it allows seeing actions that have already faded from view in regular cameras.

Core claim

The authors claim that coupling visual-language models with a constrained diffusion process allows the recovery of plausible scene states from up to 120 seconds in the past, as shown in evaluations across three controlled scenarios.

What carries the argument

A constrained diffusion process guided by two visual language models, one for generating scene descriptions and the other for directing image reconstruction from thermal traces.
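The two-model arrangement can be pictured as a small pipeline. The sketch below is an illustration under stated assumptions, not the authors' implementation: `describe_scene` and `reconstruct_past` are hypothetical stand-ins for the two VLM roles, the frames are plain nested lists (single-channel for brevity), and the stub models exist only so the example runs end to end.

```python
from dataclasses import dataclass
from typing import Callable, List

Image = List[List[float]]  # hypothetical stand-in for an image array


@dataclass
class TimeReversalPipeline:
    # Role 1 (assumed interface): model that turns RGB + thermal into a text description.
    describe_scene: Callable[[Image, Image], str]
    # Role 2 (assumed interface): guided generator that edits the RGB frame
    # so that it agrees with the description.
    reconstruct_past: Callable[[Image, str], Image]

    def run(self, rgb: Image, thermal: Image) -> Image:
        if len(rgb) != len(thermal) or len(rgb[0]) != len(thermal[0]):
            raise ValueError("RGB and thermal frames must be co-registered (same size)")
        description = self.describe_scene(rgb, thermal)
        return self.reconstruct_past(rgb, description)


# Stub models so the sketch is runnable; real model calls would replace them.
def stub_describe(rgb: Image, thermal: Image) -> str:
    hottest = max(max(row) for row in thermal)
    # 30.0 is an arbitrary illustrative threshold in degrees Celsius.
    return "a warm trace is present" if hottest > 30.0 else "no trace"


def stub_reconstruct(rgb: Image, description: str) -> Image:
    # Returns a copy of the input; a real generator would paint the person back in.
    return [row[:] for row in rgb]


pipeline = TimeReversalPipeline(stub_describe, stub_reconstruct)
past = pipeline.run([[0.1, 0.2], [0.3, 0.4]], [[22.0, 31.5], [22.0, 22.0]])
```

Keeping the two roles behind separate callables makes it easy to swap either model for a baseline, which is what an ablation of the thermal input would need.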

If this is right

Recovers scene states from a few seconds to 120 seconds earlier in controlled tests.
Ensures semantic and structural consistency in the reconstructed images.
Extends beyond RGB camera capabilities by using thermal traces as temporal codes.
Provides a first step toward time-reversed imaging in forensics and scene analysis.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Applying this to real-world uncontrolled environments might require adjustments for varying thermal decay rates.
Combining with other sensors could improve accuracy for longer time intervals.
Exploring the method on dynamic scenes with multiple interactions could test its scalability.
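To make the first point concrete, the sketch below assumes a simple exponential cooling model (Newton's law of cooling), which is an illustrative assumption and not a model taken from the paper. The function name, the time constant `tau`, and all temperatures are hypothetical. It shows that one observed trace temperature maps to very different elapsed times when the surface's decay rate changes.

```python
import math


def elapsed_seconds(t_now: float, t_contact: float, t_ambient: float, tau: float) -> float:
    """Invert T(t) = T_amb + (T_0 - T_amb) * exp(-t / tau) for t.

    t_contact is the surface temperature right after contact ended (T_0),
    t_now the temperature observed in the thermal frame, tau the assumed
    time constant of the material in seconds.
    """
    if not (t_ambient < t_now <= t_contact):
        raise ValueError("expected t_ambient < t_now <= t_contact")
    return tau * math.log((t_contact - t_ambient) / (t_now - t_ambient))


# Same observed trace, two assumed surface time constants (illustrative values).
for tau in (40.0, 120.0):
    print(tau, round(elapsed_seconds(24.0, 30.0, 22.0, tau), 1))
```

Under this model the inferred delay scales linearly with `tau`, so an unknown material would need either calibration or a learned prior before trace intensity could be read as a clock.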

Load-bearing premise

Thermal traces encode enough distinguishable information about prior human interactions for visual language models to infer and reconstruct accurate past scene states.

What would settle it

Observing whether reconstructions fail when thermal data is replaced with random heat patterns while RGB remains unchanged would test if the thermal information is truly necessary and sufficient.
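One way to build such a control input is to shuffle the thermal pixels. The sketch below is a minimal illustration with hypothetical names and toy values: it keeps the temperature histogram intact but destroys the spatial layout of the trace, so any reconstruction that still looks plausible would have to come from the RGB frame and model priors alone.

```python
import random
from typing import List

Thermal = List[List[float]]  # hypothetical stand-in for a thermal frame in degrees Celsius


def scramble_thermal(thermal: Thermal, seed: int = 0) -> Thermal:
    """Shuffle pixel positions: same temperature values, no coherent trace shape."""
    height, width = len(thermal), len(thermal[0])
    flat = [value for row in thermal for value in row]
    random.Random(seed).shuffle(flat)  # seeded so the control is reproducible
    return [flat[r * width:(r + 1) * width] for r in range(height)]


# Toy frame: ambient background with a small warm imprint.
thermal = [[22.0, 22.0, 22.0], [22.0, 29.5, 28.0], [22.0, 27.0, 22.0]]
control = scramble_thermal(thermal, seed=1)
```

Feeding `control` in place of `thermal` while holding the RGB frame fixed would give the comparison described above.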

Original abstract

Recovering the past from present observations is an intriguing challenge with potential applications in forensics and scene analysis. Thermal imaging, operating in the infrared range, provides access to otherwise invisible information. Since humans are typically warmer (37 °C / 98.6 °F) than their surroundings, interactions such as sitting, touching, or leaning leave residual heat traces. These fading imprints serve as passive temporal codes, allowing for the inference of recent events that exceed the capabilities of RGB cameras. This work proposes a time-reversed reconstruction framework that uses paired RGB and thermal images to recover scene states from a few seconds earlier. The proposed approach couples Visual-Language Models (VLMs) with a constrained diffusion process, where one VLM generates scene descriptions and another guides image reconstruction, ensuring semantic and structural consistency. The method is evaluated in three controlled scenarios, demonstrating the feasibility of reconstructing plausible past frames up to 120 seconds earlier, providing a first step toward time-reversed imaging from thermal traces.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper sketches a VLM-guided diffusion pipeline to reconstruct scenes from fading thermal traces up to two minutes back, but the support stays at qualitative feasibility in three controlled cases.

The main point is that this work explores recovering recent past events from residual heat left on surfaces, using thermal images paired with visual language models to steer a diffusion process. The idea targets forensics and scene analysis where RGB alone falls short, and the authors show it can produce plausible earlier frames in indoor tests. That combination of VLMs for description and constraint with diffusion for thermal inversion is the fresh element here, not something already covered in the cited thermal or reconstruction papers. It frames the problem cleanly and demonstrates that heat imprints can carry some timing information beyond visible light captures. The controlled demos make a reasonable case that the approach is at least feasible for short time windows like 120 seconds.

The soft spots sit in the evaluation. No quantitative metrics appear, no error bars or accuracy numbers, and no comparisons to baselines such as RGB-only methods or standard inpainting. Without those, it is hard to tell whether the thermal signal itself drives the timing or whether the models are mostly filling in from general scene knowledge. The three scenarios are all lab-like, so real-world clutter, material differences, or longer delays remain untested.

This kind of paper suits readers working on multimodal reconstruction or thermal sensing for security uses. Someone looking for early-stage ideas on passive temporal inference could pull useful framing from it, though it would need tighter validation to stand on its own.

I would send it for peer review. The core direction has enough novelty to justify referee time, and the feedback can focus on adding the missing quantitative checks and broader testing.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes a time-reversed scene reconstruction framework that pairs current RGB images with thermal images to recover plausible past scene states up to 120 seconds earlier. Fading heat traces from human interactions are treated as passive temporal codes; one VLM generates scene descriptions while a second VLM constrains a diffusion process to enforce semantic and structural consistency. The approach is evaluated only through feasibility demonstrations in three controlled scenarios.

Significance. If the central claim holds under quantitative scrutiny, the work would represent a novel integration of VLMs with constrained diffusion for passive temporal inference from thermal data, with clear potential applications in forensics and scene analysis. The idea of using residual thermal imprints as distinguishable temporal signals beyond RGB is intriguing and could stimulate further research in time-reversed imaging, provided the thermal signal is shown to contribute information not already available from scene priors.

major comments (2)

[Abstract] Abstract: The evaluation is limited to 'three controlled scenarios' demonstrating 'feasibility' of reconstructing 'plausible past frames,' yet no quantitative metrics, error rates, baseline comparisons (e.g., RGB-only or non-VLM diffusion), or ground-truth past-frame distances are reported. This absence directly undermines verification of whether reconstructions recover actual prior states or merely generate semantically consistent scenes from VLM world knowledge.
[Method] Method description (inferred from abstract): The claim that the second VLM 'guides image reconstruction, ensuring semantic and structural consistency' lacks any specification of the constraint mechanism, loss terms, or how the diffusion process inverts thermal decay physics rather than defaulting to generic scene priors. Without these details, it is impossible to assess whether the thermal trace supplies the claimed distinguishable temporal information.

minor comments (1)

[Abstract] The abstract would benefit from explicit statements on the exact temporal windows tested and any failure cases observed in the controlled scenarios.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address each major comment below and indicate planned revisions to improve the manuscript.

Referee: [Abstract] Abstract: The evaluation is limited to 'three controlled scenarios' demonstrating 'feasibility' of reconstructing 'plausible past frames,' yet no quantitative metrics, error rates, baseline comparisons (e.g., RGB-only or non-VLM diffusion), or ground-truth past-frame distances are reported. This absence directly undermines verification of whether reconstructions recover actual prior states or merely generate semantically consistent scenes from VLM world knowledge.

Authors: We agree that the current evaluation focuses on qualitative feasibility demonstrations without quantitative metrics or baselines. The manuscript positions the work as an initial proof-of-concept for thermal-trace-based time-reversed reconstruction. In revision we will add quantitative evaluations, including perceptual similarity metrics and comparisons to RGB-only and unconstrained diffusion baselines, along with ground-truth frame distances where controlled capture permits direct comparison. This will better isolate the contribution of the thermal signal.
Revision: yes
Referee: [Method] Method description (inferred from abstract): The claim that the second VLM 'guides image reconstruction, ensuring semantic and structural consistency' lacks any specification of the constraint mechanism, loss terms, or how the diffusion process inverts thermal decay physics rather than defaulting to generic scene priors. Without these details, it is impossible to assess whether the thermal trace supplies the claimed distinguishable temporal information.

Authors: The full manuscript provides a high-level description of the VLM-constrained diffusion; however, we acknowledge that explicit technical details on the constraint formulation are needed. In the revised version we will expand the method section to specify the constraint mechanism, including the loss terms that incorporate the thermal trace and scene description, and clarify how the process uses residual heat decay to guide temporal inference beyond generic priors.
Revision: yes
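The ground-truth frame distances promised in the first response could start from something as simple as the sketch below. It is a hedged illustration, not the authors' evaluation code: MSE and PSNR are pixel-wise measures rather than perceptual ones, frames are assumed to be single-channel nested lists with values in [0, 1], and the three toy frames are invented for the example.

```python
import math
from typing import List

Frame = List[List[float]]  # single-channel values in [0, 1]; an assumption for brevity


def mse(a: Frame, b: Frame) -> float:
    """Mean squared error between two frames of identical shape."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("frames must have the same shape")
    count = sum(len(row) for row in a)
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / count


def psnr(a: Frame, b: Frame, peak: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB; infinite for identical frames."""
    err = mse(a, b)
    return math.inf if err == 0 else 10.0 * math.log10(peak ** 2 / err)


# Invented toy frames: the captured past frame and two candidate reconstructions.
ground_truth = [[0.2, 0.8], [0.5, 0.5]]
with_thermal = [[0.2, 0.7], [0.5, 0.5]]
rgb_only = [[0.2, 0.2], [0.5, 0.5]]

scores = {
    "with_thermal": psnr(ground_truth, with_thermal),
    "rgb_only": psnr(ground_truth, rgb_only),
}
```

A consistent gap between the two scores across many captured delays would be evidence that the thermal input carries information the RGB-only baseline lacks; a perceptual metric would complement this pixel-wise check.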

Circularity Check

0 steps flagged

No circularity: framework is a procedural combination of existing VLMs and diffusion without self-referential derivations or fitted predictions

The paper presents a time-reversed reconstruction approach that pairs RGB-thermal inputs with VLMs for description and constrained diffusion for image generation. No equations, parameter fits, or derivations are described that reduce outputs to inputs by construction. The central claim rests on the empirical feasibility of using fading thermal traces as temporal signals in controlled scenarios, which is an external assumption open to validation rather than a definitional loop. No self-citations are invoked as load-bearing uniqueness theorems, and the method is explicitly positioned as a first step combining known components. This satisfies the criteria for a self-contained, non-circular presentation.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the domain assumption that thermal traces from human interactions contain usable temporal information and that VLMs can reliably translate this into consistent image reconstructions without introducing new physical entities.

axioms (1)

domain assumption Thermal traces from human interactions (sitting, touching, leaning) provide distinguishable and reconstructible information about past scene states.
Invoked in the abstract when stating that fading imprints serve as passive temporal codes allowing inference of recent events.

pith-pipeline@v0.9.0 · 5709 in / 1222 out tokens · 32413 ms · 2026-05-18T09:31:04.237064+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/ArrowOfTime.lean · arrow_from_z
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "These fading imprints serve as passive temporal codes, allowing for the inference of recent events that exceed the capabilities of RGB cameras."

IndisputableMonolith/Foundation/AlphaCoordinateFixation.lean · J_uniquely_calibrated_via_higher_derivative
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "The proposed approach couples Visual-Language Models (VLMs) with a constrained diffusion process"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

24 extracted references · 24 canonical work pages · 3 internal anchors

[1] See the past: Time-Reversed Scene Reconstruction from Thermal Traces Using Visual Language Models. arXiv, 2025.

INTRODUCTION Thermal cameras measure long-wave infrared radiation (≈ 8–14 µm), capturing temperature distributions rather than reflected visible light [16]. Unlike RGB sensors, which record instantaneous intensity values in the visible range, thermal imaging provides access to heat transfer processes that often persist after an interaction has ended. T...

[2] METHODOLOGY 2.1. Problem Formulation

The problem of time-reversed scene reconstruction is formulated as the task of inferring a plausible past image $x^{t-\Delta}_{RGB}$ from current static multimodal observations. Specifically, access is assumed to an RGB frame $x^{t}_{RGB} \in \mathbb{R}^{h \times w \times 3}$ and a co-registered thermal measurement $x^{t}_{Thermal} \in \mathbb{R}^{h \times w}$ capturing residual heat tra...

[3] "what-just-happened"

SIMULATIONS AND RESULTS To validate the proposed method, a dataset was constructed comprising three controlled scenarios: sitting on a chair, touching an object, and leaning against a wall. In each case, a person maintained physical contact with the scene for 30 seconds, after which RGB and thermal images were acquired at multiple time delays (5s, 15s, 30s, 1...

[4] "the person was sitting and holding the book"

Cross-check with the RGB image and locate every object with heat traces. For each object, provide: Object type, Object color, Position (left, center, right), Interaction with the person (touching, sitting, holding, near, none), Direction relative to the scene (front, back, left, right, corner). Final output: Provide only one short, direct sentence in p...

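The per-object fields listed in this prompt excerpt map naturally onto a small validated record. The sketch below is an assumption about how a VLM's answer could be checked downstream; the class name and the example values are hypothetical, while the allowed values for position, interaction, and direction are copied from the excerpt.

```python
from dataclasses import dataclass

# Allowed values as listed in the prompt excerpt.
POSITIONS = {"left", "center", "right"}
INTERACTIONS = {"touching", "sitting", "holding", "near", "none"}
DIRECTIONS = {"front", "back", "left", "right", "corner"}


@dataclass(frozen=True)
class TracedObject:
    """One object with a heat trace, as described by the first model (hypothetical schema)."""
    object_type: str
    color: str
    position: str
    interaction: str
    direction: str

    def __post_init__(self) -> None:
        checks = (
            ("position", self.position, POSITIONS),
            ("interaction", self.interaction, INTERACTIONS),
            ("direction", self.direction, DIRECTIONS),
        )
        for name, value, allowed in checks:
            if value not in allowed:
                raise ValueError(f"{name} must be one of {sorted(allowed)}, got {value!r}")


# Example values are invented for illustration.
chair = TracedObject("chair", "black", "center", "sitting", "front")
```

Rejecting out-of-vocabulary answers before they reach the image generator is one cheap way to keep a free-form model response from silently steering the reconstruction.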
[5] CONCLUSIONS AND FUTURE WORK

This work presents a proof-of-concept framework for reconstructing recent past events by combining RGB and thermal imaging with VLM-guided diffusion models. To our knowledge, this is the first attempt to treat fading thermal imprints as temporal codes for scene reconstruction. Controlled experiments validate the feasibili...

[6] Stephen Batifol, Andreas Blattmann, Frederic Boesel, Saksham Consul, Cyril Diagne, Tim Dockhorn, Jack English, Zion English, Patrick Esser, Sumith Kulal, et al. FLUX.1 Kontext: Flow matching for in-context image generation and editing in latent space. arXiv e-prints, pages arXiv–2506, 2025.
[7] James W Davis and Vinay Sharma. Background-subtraction using contour-based fusion of thermal and visible imagery. Computer Vision and Image Understanding, 106(2-3):162–182, 2007.
[8] Bakhtiar Feizizadeh and Thomas Blaschke. Thermal remote sensing for land surface temperature monitoring: Maraqeh County, Iran. In 2012 IEEE International Geoscience and Remote Sensing Symposium, pages 2217–2220. IEEE, 2012.
[9] Google. Introducing Gemini 2.5 Flash Image (aka nano banana). https://developers.googleblog.com/en/introducing-gemini-2-5-flash-image/,
[11] Carlos Hinojosa, Jorge Bacca, and Henry Arguello. Coded aperture design for compressive spectral subspace clustering. IEEE Journal of Selected Topics in Signal Processing, 12(6):1589–1600, 2018.
[12] Ziwei Liu, Raymond A Yeh, Xiaoou Tang, Yiming Liu, and Aseem Agarwala. Video frame synthesis using deep voxel flow. In Proceedings of the IEEE International Conference on Computer Vision, pages 4463–4471, 2017.
[13] William Lotter, Gabriel Kreiman, and David Cox. Deep predictive coding networks for video prediction and unsupervised learning. arXiv preprint arXiv:1605.08104, 2016.
[14] Brayan Monroy, Jorge Bacca, and Julián Tachella. Generalized recorrupted-to-recorrupted: Self-supervised learning beyond Gaussian noise. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 28155–28164, 2025.
[15] Daniel S Moran and Liran Mendal. Core temperature measurement: methods and current insights. Sports Medicine, 32(14):879–885, 2002.
[16] OpenAI. DALL·E 3. https://openai.com/dall-e-3,
[17] Accessed: 2025-09-30
[18] PixVerse. PixVerse AI video generator. https://app.pixverse.ai, 2025. Accessed: 2025-09-30.
[19] EFJ Ring and Kurt Ammer. Infrared thermal imaging in medicine. Physiological Measurement, 33(3):R33, 2012.
[20] Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Björn Ommer. High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 10684–10695, 2022.
[21] Team Seedream, Yunpeng Chen, Yu Gao, Lixue Gong, Meng Guo, Qiushan Guo, Zhiyao Guo, Xiaoxia Hou, Weilin Huang, Yixuan Huang, et al. Seedream 4.0: Toward next-generation multimodal image generation. arXiv preprint arXiv:2509.20427, 2025.
[22] Zitian Tang, Wenjie Ye, Wei-Chiu Ma, and Hang Zhao. What happened 3 seconds ago? Inferring the past with thermal imaging. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 17111–17120, 2023.
[23] Michael Vollmer and Klaus-Peter Möllmann. Infrared thermal imaging: fundamentals, research and applications. John Wiley & Sons, 2018.
[24] xAI. Grok. https://x.ai, 2025. Accessed: 2025-09-30.
[25] Wei Yu, Yichao Lu, Steve Easterbrook, and Sanja Fidler. CrevNet: Conditionally reversible video prediction. arXiv preprint arXiv:1910.11577, 2019.
