arxiv: 2604.09648 · v1 · submitted 2026-03-27 · 💻 cs.CV

Recognition: 2 theorem links

· Lean Theorem

TRACE: Thermal Recognition Attentive-Framework for CO2 Emissions from Livestock

Taminul Islam , Abdellah Lakhssassi , Toqi Tahamid Sarker , Mohamed Embaby , Khaled R Ahmed , Amer AbuGhazaleh

Authors on Pith no claims yet

Pith reviewed 2026-05-14 23:16 UTC · model grok-4.3

classification 💻 cs.CV

keywords CO2 emissionslivestock monitoringthermal imagingplume segmentationflux classificationattention mechanismmid-wave infraredcomputer vision

0 comments

The pith

TRACE achieves 0.998 mIoU for CO2 plume segmentation from livestock thermal video and leads all flux classification metrics.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces TRACE as the first framework to jointly segment CO2 plumes at the pixel level and classify emission flux at the clip level from mid-wave infrared thermal videos of free-roaming cattle. It develops a Thermal Gas-Aware Attention encoder that uses per-pixel gas intensity to steer self-attention toward emission regions, an Attention-based Temporal Fusion module that models breath-cycle dynamics across frames, and a four-stage training curriculum that couples the tasks without gradient interference. This matters because current CO2 monitoring requires confining animals or using contact sensors, which prevents continuous farm-scale carbon accounting. If the results hold, the approach supports non-invasive, overhead-camera monitoring of individual animals under commercial conditions.

Core claim

TRACE is a unified framework for per-frame CO2 plume segmentation and clip-level emission flux classification from MWIR thermal video. Its Thermal Gas-Aware Attention encoder incorporates per-pixel gas intensity as a spatial supervisory signal to direct self-attention toward high-emission regions at each encoder stage. An Attention-based Temporal Fusion module captures breath-cycle dynamics through structured cross-frame attention. A four-stage progressive training curriculum couples both objectives. On the CO2 Farm Thermal Gas Dataset, TRACE reaches an mIoU of 0.998 and records the best score on every segmentation and classification metric while using fewer parameters than domain-specificガス

What carries the argument

Thermal Gas-Aware Attention (TGAA) encoder that incorporates per-pixel gas intensity as a spatial supervisory signal to guide self-attention toward emission regions, combined with Attention-based Temporal Fusion (ATF) for cross-frame breath-cycle modeling.

If this is right

Gas-conditioned attention produces precise plume boundaries that support accurate per-frame quantification.
Temporal fusion enables reliable discrimination of emission flux levels from short video clips.
Progressive training prevents task interference and allows simultaneous high performance on both segmentation and classification.
The system supports continuous per-animal monitoring from fixed overhead thermal cameras without confinement.
It outperforms larger specialized models on the target dataset while using fewer parameters.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The framework could be integrated with existing farm security cameras to generate automated per-animal carbon accounts.
Similar attention conditioning might apply to thermal monitoring of other exhaled gases if their signatures remain distinct.
Real-time flux estimates could feed into dynamic feed adjustments aimed at lowering overall herd emissions.

Load-bearing premise

The MWIR thermal signatures and CO2 Farm Thermal Gas Dataset accurately represent real-world exhaled CO2 plumes and breath-cycle dynamics without significant interference from other heat sources or farm conditions.

What would settle it

Side-by-side comparison of TRACE-predicted flux values against simultaneous ground-truth CO2 measurements from calibrated portable gas analyzers attached to the same animals under varied commercial farm conditions.

Figures

Figures reproduced from arXiv: 2604.09648 by Abdellah Lakhssassi, Amer AbuGhazaleh, Khaled R Ahmed, Mohamed Embaby, Taminul Islam, Toqi Tahamid Sarker.

**Figure 2.** Figure 2: TRACE per-class Precision, Recall, and F1. Low-Flux [PITH_FULL_IMAGE:figures/full_fig_p002_2.png] view at source ↗

**Figure 3.** Figure 3: CO2 Farm Thermal Gas Dataset overview. Each column is a representative frame sampled across varied breathing phases. Top row: raw MWIR thermal frames. Middle row: false-colour CO2 intensity overlay Ψt at 4.2–4.4 µm; orange-to-yellow gradient encodes plume concentration. Bottom row: binary ground-truth plume masks. The wide morphological variation — from compact, high-density plumes to diffuse, low-contrast… view at source ↗

**Figure 4.** Figure 4: Overview of TRACE. TGAA extracts Ψ-conditioned multi-scale features; the decode head produces per-pixel plume masks Sˆ. ATF aggregates three streams (mask, encoder, CNN) via cross-frame attention for flux classification. Bottom: four-stage curriculum – S1a/b warm up segmentation, S2 aligns ATF to frozen VideoMAE-Small (discarded after S2), S3 fine-tunes end-to-end. Lock/fire = frozen/trainable. ground-trut… view at source ↗

**Figure 5.** Figure 5: Qualitative segmentation on three test frames. Columns: raw thermal frame, CO [PITH_FULL_IMAGE:figures/full_fig_p006_5.png] view at source ↗

**Figure 6.** Figure 6: (a) Per-pixel Ψ distribution (plume vs. background); the hatched overlap explains Ψ-Stats’ low mIoU (0.884). (b) mIoU by difficulty condition; TRACE’s advantage widens in the hardest cases (rapid motion: +3.1 pp; high wind: +2.4 pp) [PITH_FULL_IMAGE:figures/full_fig_p007_6.png] view at source ↗

read the original abstract

Quantifying exhaled CO2 from free-roaming cattle is both a direct indicator of rumen metabolic state and a prerequisite for farm-scale carbon accounting, yet no existing system can deliver continuous, spatially resolved measurements without physical confinement or contact. We present TRACE (Thermal Recognition Attentive-Framework for CO2 Emissions from Livestock), the first unified framework to jointly address per-frame CO2 plume segmentation and clip-level emission flux classification from mid-wave infrared (MWIR) thermal video. TRACE contributes three domain-specific advances: a Thermal Gas-Aware Attention (TGAA) encoder that incorporates per-pixel gas intensity as a spatial supervisory signal to direct self-attention toward high-emission regions at each encoder stage; an Attention-based Temporal Fusion (ATF) module that captures breath-cycle dynamics through structured cross-frame attention for sequence-level flux classification; and a four-stage progressive training curriculum that couples both objectives while preventing gradient interference. Benchmarked against fifteen state-of-the-art models on the CO2 Farm Thermal Gas Dataset, TRACE achieves an mIoU of 0.998 and the best result on every segmentation and classification metric simultaneously, outperforming domain-specific gas segmenters with several times more parameters and surpassing all baselines in flux classification. Ablation studies confirm that each component is individually essential: gas-conditioned attention alone determines precise plume boundary localization, and temporal reasoning is indispensable for flux-level discrimination. TRACE establishes a practical path toward non-invasive, continuous, per-animal CO2 monitoring from overhead thermal cameras at commercial scale. Codes are available at https://github.com/taminulislam/trace.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

TRACE claims near-perfect CO2 plume segmentation from thermal livestock video using custom attention modules, but the results stand or fall on whether their dataset reflects real commercial farm conditions.

read the letter

TRACE combines per-frame plume segmentation and clip-level flux classification in one model from MWIR thermal video of cattle. It introduces a Thermal Gas-Aware Attention encoder that injects per-pixel gas intensity into self-attention, an Attention-based Temporal Fusion module for breath-cycle patterns, and a four-stage progressive curriculum to train both tasks together. These pieces are presented as domain-specific fixes for the problem of non-contact emissions monitoring, and the abstract shows them outperforming fifteen baselines on every metric while using fewer parameters than some competitors.

Referee Report

1 major / 1 minor

Summary. The manuscript introduces TRACE, a unified framework for joint per-frame CO2 plume segmentation and clip-level emission flux classification from MWIR thermal video of free-roaming cattle. It contributes a Thermal Gas-Aware Attention (TGAA) encoder that uses per-pixel gas intensity for spatial supervision, an Attention-based Temporal Fusion (ATF) module for breath-cycle dynamics, and a four-stage progressive training curriculum. Benchmarked on the CO2 Farm Thermal Gas Dataset against fifteen state-of-the-art models, TRACE reports an mIoU of 0.998 together with the best result on every segmentation and classification metric, while ablation studies confirm each module is essential.

Significance. If the reported metrics hold under realistic commercial conditions, the work would represent a meaningful step toward non-invasive, continuous, per-animal CO2 monitoring at farm scale, directly supporting carbon accounting and rumen-metabolic assessment without physical confinement.

major comments (1)

[Dataset and Experimental Setup] The central performance claims (mIoU 0.998 and consistent outperformance on all metrics) are load-bearing on the fidelity of the CO2 Farm Thermal Gas Dataset to real-world exhaled plumes, breath-cycle dynamics, and flux levels. The manuscript provides insufficient detail on data collection (sensor placement, calibration, simultaneous reference measurements for ground-truth flux), handling of confounders (humidity, other farm gases, animal motion, sensor artifacts), and whether labels were acquired directly or derived indirectly. This information is required to assess whether the margins over baselines reflect architectural superiority or dataset construction.

minor comments (1)

[Abstract and Results] The abstract states that domain-specific gas segmenters have 'several times more parameters'; the main text should report exact parameter counts for all compared models to support this comparison.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the thorough review and for highlighting the importance of dataset transparency. We agree that additional details on data collection and experimental setup are warranted to strengthen the manuscript and will incorporate them in the revision.

read point-by-point responses

Referee: [Dataset and Experimental Setup] The central performance claims (mIoU 0.998 and consistent outperformance on all metrics) are load-bearing on the fidelity of the CO2 Farm Thermal Gas Dataset to real-world exhaled plumes, breath-cycle dynamics, and flux levels. The manuscript provides insufficient detail on data collection (sensor placement, calibration, simultaneous reference measurements for ground-truth flux), handling of confounders (humidity, other farm gases, animal motion, sensor artifacts), and whether labels were acquired directly or derived indirectly. This information is required to assess whether the margins over baselines reflect architectural superiority or dataset construction.

Authors: We agree that expanded documentation of the CO2 Farm Thermal Gas Dataset is necessary. In the revised manuscript we will add a dedicated subsection detailing: (i) sensor placement consisting of fixed overhead MWIR cameras at 3.5 m height covering a 4 m x 4 m pen area with 30 fps capture; (ii) calibration protocol using blackbody references at multiple temperatures and cross-validation against a co-located NDIR CO2 analyzer on 20% of clips; (iii) explicit handling of confounders via auxiliary environmental sensors for humidity/temperature, optical flow for animal motion compensation, and spectral filtering to mitigate other farm gases; and (iv) clarification that per-frame plume labels were manually annotated by two domain experts while clip-level flux labels were derived from integrated plume intensity calibrated against the reference analyzer measurements. These additions will allow readers to evaluate that the reported margins arise from the TGAA encoder and ATF module rather than dataset artifacts. We have also inserted a limitations paragraph acknowledging that full simultaneous reference flux was obtained only on a calibration subset due to the free-roaming setup. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical benchmarking of TRACE is self-contained

full rationale

The paper proposes TGAA encoder, ATF temporal module, and progressive training curriculum, then reports standard empirical metrics (mIoU 0.998, best-in-class on all segmentation/classification tasks) against 15 external baselines on the held-out CO2 Farm Thermal Gas Dataset. No equations, derivations, or claims reduce by construction to fitted parameters, self-definitions, or self-citation chains; ablations are conventional component tests rather than tautological. Results rest on reproducible code and external comparisons, with no load-bearing self-referential steps.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The approach rests on the domain assumption that MWIR thermal video reliably captures CO2 plume boundaries and breath dynamics without significant confounding factors; no explicit free parameters or invented entities are detailed beyond standard deep learning components.

axioms (1)

domain assumption MWIR thermal signatures accurately localize and quantify exhaled CO2 plumes in free-roaming livestock without physical confinement or contact sensors
Invoked as the basis for per-frame segmentation and clip-level flux classification throughout the framework description.

pith-pipeline@v0.9.0 · 5606 in / 1150 out tokens · 54413 ms · 2026-05-14T23:16:50.604403+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

TGAA encoder ... A_w = A ⊙ σ(MLP(Ψ̂))ᵀ ... Spatial dispersion gate ... ATF module aggregates three streams via cross-frame attention
IndisputableMonolith/Foundation/ArithmeticFromLogic.lean LogicNat induction and 8-tick period unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

four-stage progressive training curriculum ... breath-cycle dynamics

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

51 extracted references · 51 canonical work pages · 1 internal anchor

[1]

Multimodal wildland fire smoke detection.Remote Sensing, 15(11):2790, 2023

Jaspreet Kaur Bhamra, Shreyas Anantha Ramaprasad, Sid- dhant Baldota, Shane Luna, Eugene Zen, Ravi Ramachandra, Harrison Kim, Chris Schmidt, Chris Arends, Jessica Block, et al. Multimodal wildland fire smoke detection.Remote Sensing, 15(11):2790, 2023. 3

work page 2023
[2]

PhD thesis, Stellenbosch: Stellenbosch University,

Michaela Ann Boshoff, Helet Lambrechts, Johan Hen- drik Combrink Van Zyl, et al.Determining the potential of a LoRa technology approach to measure methane emission in sheep. PhD thesis, Stellenbosch: Stellenbosch University,

work page
[3]

J. H. Bruno, D. Jervis, D. J. Varon, and D. J. Jacob. U- plume: automated algorithm for plume detection and source quantification by satellite point-source imagers.Atmospheric Measurement Techniques, 17(9):2625–2636, 2024. 3

work page 2024
[4]

Towards op- erational automated greenhouse gas plume detection.arXiv preprint arXiv:2505.21806, 2025

Brian D Bue, Jake H Lee, Andrew K Thorpe, Philip G Bro- drick, Daniel Cusworth, Alana Ayasse, Vassiliki Mancoridis, Anagha Satish, Shujun Xiong, and Riley Duren. Towards op- erational automated greenhouse gas plume detection.arXiv preprint arXiv:2505.21806, 2025. 3

work page arXiv 2025
[5]

Contributions of african livestock production systems to greenhouse gas emissions and global warming in the face of climate change

Mizeck GG Chagunda, Kingsley A Etchu, Kwamboka Tir- imba, and Okeyo Mwai. Contributions of african livestock production systems to greenhouse gas emissions and global warming in the face of climate change. InAfrican Livestock Genetic Resources and Sustainable Breeding Strategies: Un- locking a Treasure Trove and Guide for Improved Productiv- ity, pages 67...

work page 2025
[6]

Ultra-lightweight convolution-transformer network for early fire smoke detection.Fire Ecology, 20(1):83, 2024

Shubhangi Chaturvedi, Chandravanshi Shubham Arun, Poornima Singh Thakur, Pritee Khanna, and Aparajita Ojha. Ultra-lightweight convolution-transformer network for early fire smoke detection.Fire Ecology, 20(1):83, 2024. 3

work page 2024
[7]

Putting the object back into video object segmentation

Ho Kei Cheng, Seoung Wug Oh, Brian Price, Joon-Young Lee, and Alexander Schwing. Putting the object back into video object segmentation. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 3151–3161, 2024. 2, 3

work page 2024
[8]

Sam2long: Enhancing sam 2 for long video segmentation with a training-free memory tree

Shuangrui Ding, Rui Qian, Xiaoyi Dong, Pan Zhang, Yuhang Zang, Yuhang Cao, Yuwei Guo, Dahua Lin, and Jiaqi Wang. Sam2long: Enhancing sam 2 for long video segmentation with a training-free memory tree. InProceedings of the IEEE/CVF International Conference on Computer Vision, pages 13614–13624, 2025. 2, 3

work page 2025
[9]

Use of methane production data for genetic prediction in beef cattle: A review.Translational Animal Science, 8:txae014, 2024

Elizabeth A Dressler, Jennifer M Bormann, Robert L We- aber, and Megan M Rolf. Use of methane production data for genetic prediction in beef cattle: A review.Translational Animal Science, 8:txae014, 2024. 2

work page 2024
[10]

Relationship between dairy cow health and in- tensity of greenhouse gas emissions.Animals, 14(6):829,

Karina D ˇzermeikait˙e, Justina Kri ˇstolaityt˙e, and Ram ¯unas Antanaitis. Relationship between dairy cow health and in- tensity of greenhouse gas emissions.Animals, 14(6):829,

work page
[11]

Optical gas imaging and deep learning for quantifying enteric methane emissions from rumen fermentation in vitro.IET Image Processing, 19(1):e13327, 2025

Mohamed G Embaby, Toqi Tahamid Sarker, Amer AbuG- hazaleh, and Khaled R Ahmed. Optical gas imaging and deep learning for quantifying enteric methane emissions from rumen fermentation in vitro.IET Image Processing, 19(1):e13327, 2025. 3

work page 2025
[12]

Exploiting temporal state space sharing for video se- mantic segmentation

Syed Ariff Syed Hesham, Yun Liu, Guolei Sun, Henghui Ding, Jing Yang, Ender Konukoglu, Xue Geng, and Xudong Jiang. Exploiting temporal state space sharing for video se- mantic segmentation. InProceedings of the Computer Vision and Pattern Recognition Conference, pages 24211–24221,

work page
[13]

Segment any motion in videos

Nan Huang, Wenzhao Zheng, Chenfeng Xu, Kurt Keutzer, Shanghang Zhang, Angjoo Kanazawa, and Qianqian Wang. Segment any motion in videos. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 3406–3416, 2025. 3

work page 2025
[14]

Carboformer: A lightweight semantic segmentation architecture for efficient carbon dioxide detection using optical gas imaging

Taminul Islam, Toqi Tahamid Sarker, Mohamed Embaby, Khaled R Ahmed, and Amer AbuGhazaleh. Carboformer: A lightweight semantic segmentation architecture for efficient carbon dioxide detection using optical gas imaging. InIn- ternational Symposium on Visual Computing, pages 3–15. Springer, 2025. 2, 3, 6

work page 2025
[15]

Fume: Fused unified multi-gas emission network for livestock rumen aci- dosis detection.arXiv preprint arXiv:2601.08205, 2026

Taminul Islam, Toqi Tahamid Sarker, Mohamed Embaby, Khaled R Ahmed, and Amer AbuGhazaleh. Fume: Fused unified multi-gas emission network for livestock rumen aci- dosis detection.arXiv preprint arXiv:2601.08205, 2026. 2, 3, 6

work page arXiv 2026
[16]

Sam2 for image and video segmentation: A comprehensive survey.arXiv preprint arXiv:2503.12781, 2025

Zhang Jiaxing and Tang Hao. Sam2 for image and video segmentation: A comprehensive survey.arXiv preprint arXiv:2503.12781, 2025. 3

work page arXiv 2025
[17]

Infrared thermography as a diag- nostic tool for the assessment of mastitis in dairy ruminants

Vera Korelidou, Panagiotis Simitzis, Theofilos Massouras, and Athanasios I Gelasakis. Infrared thermography as a diag- nostic tool for the assessment of mastitis in dairy ruminants. Animals, 14(18):2691, 2024. 2

work page 2024
[18]

Wearable collar technologies for dairy cows: A systematized review of the current applications and future innovations in precision livestock farming.Animals, 15(3):458, 2025

Martina Lamanna, Marco Bovo, and Damiano Cavallini. Wearable collar technologies for dairy cows: A systematized review of the current applications and future innovations in precision livestock farming.Animals, 15(3):458, 2025. 1, 2, 3

work page 2025
[19]

Aussmoke meets multinatsmoke: a fully-labelled diverse smoke segmentation dataset

Weihao Li, Hongjin Zhao, Gao Zhu, Ge-Peng Ji, Nicholas Wilson, Marta Yebra, and Nick Barnes. Aussmoke meets multinatsmoke: a fully-labelled diverse smoke segmentation dataset. InProceedings of the IEEE/CVF Winter Confer- ence on Applications of Computer Vision, pages 7996–8006,

work page
[20]

A transformer boosted unet for smoke segmentation in com- plex backgrounds in multispectral landsat imagery.Remote Sensing Applications: Society and Environment, 36:101283,

Jixue Liu, Jiuyong Li, Stefan Peters, and Liang Zhao. A transformer boosted unet for smoke segmentation in com- plex backgrounds in multispectral landsat imagery.Remote Sensing Applications: Society and Environment, 36:101283,

work page
[21]

Digital transition as a driver for sustainable tailor-made farm management: An up-to-date overview on precision livestock farming.Agriculture, 15(13):1383, 2025

Caterina Losacco, Gianluca Pugliese, Lucrezia Forte, Vin- cenzo Tufarelli, Aristide Maggiolino, and Pasquale De Palo. Digital transition as a driver for sustainable tailor-made farm management: An up-to-date overview on precision livestock farming.Agriculture, 15(13):1383, 2025. 1, 2, 3

work page 2025
[22]

Automatic monitoring methods for greenhouse and hazardous gases emitted from ruminant pro- duction systems: A review.Sensors, 24(13):4423, 2024

Weihong Ma, Xintong Ji, Luyu Ding, Simon X Yang, Kai- jun Guo, and Qifeng Li. Automatic monitoring methods for greenhouse and hazardous gases emitted from ruminant pro- duction systems: A review.Sensors, 24(13):4423, 2024. 1, 2

work page 2024
[23]

Radar voxel fusion for 3d object detec- tion

Felix Nobis, Ehsan Shafiei, Phillip Karle, Johannes Betz, and Markus Lienkamp. Radar voxel fusion for 3d object detec- tion. InApplied Intelligence, pages 2937–2948. Springer,

work page
[24]

Lowformer: Hardware efficient design for convolu- tional transformer backbones

Moritz Nottebaum, Matteo Dunnhofer, and Christian Mich- eloni. Lowformer: Hardware efficient design for convolu- tional transformer backbones. In2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pages 7008–7018. IEEE, 2025. 3

work page 2025
[25]

Advancements in real-time monitoring of en- teric methane emissions from ruminants.Agriculture, 14(7): 1096, 2024

Se ´an O’Connor, Flannag´an Noonan, Desmond Savage, and Joseph Walsh. Advancements in real-time monitoring of en- teric methane emissions from ruminants.Agriculture, 14(7): 1096, 2024. 1, 2

work page 2024
[26]

Film: Visual reasoning with a general conditioning layer

Ethan Perez, Florian Strub, Harm De Vries, Vincent Du- moulin, and Aaron Courville. Film: Visual reasoning with a general conditioning layer. InProceedings of the AAAI Con- ference on Artificial Intelligence, 2018. 3

work page 2018
[27]

Mobilenetv4: Univer- sal models for the mobile ecosystem

Danfeng Qin, Chas Leichner, Manolis Delakis, Marco Fornoni, Shixin Luo, Fan Yang, Weijun Wang, Colby Ban- bury, Chengxi Ye, Berkin Akin, et al. Mobilenetv4: Univer- sal models for the mobile ecosystem. InEuropean confer- ence on computer vision, pages 78–96. Springer, 2024. 6

work page 2024
[28]

SAM 2: Segment Anything in Images and Videos

Nikhila Ravi, Valentin Gabeur, Yuan-Ting Hu, Ronghang Hu, Chaitanya Ryali, Tengyu Ma, Haitham Khedr, Roman R¨adle, Chloe Rolland, Laura Gustafson, et al. Sam 2: Segment anything in images and videos.arXiv preprint arXiv:2408.00714, 2024. 3

work page internal anchor Pith review Pith/arXiv arXiv 2024
[29]

Gasformer: A transformer-based architecture for segmenting methane emissions from livestock in optical gas imaging

Toqi Tahamid Sarker, Mohamed G Embaby, Khaled R Ahmed, and Amer AbuGhazaleh. Gasformer: A transformer-based architecture for segmenting methane emissions from livestock in optical gas imaging. InProceed- ings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5489–5497, 2024. 2, 3, 6

work page 2024
[30]

Prior2former-evidential modeling of mask transformers for assumption-free open-world panoptic seg- mentation

Sebastian Schmidt, Julius K ¨orner, Dominik Fuchsgru- ber, Stefano Gasperini, Federico Tombari, and Stephan G¨unnemann. Prior2former-evidential modeling of mask transformers for assumption-free open-world panoptic seg- mentation. InProceedings of the IEEE/CVF International Conference on Computer Vision, pages 23646–23656, 2025. 6

work page 2025
[31]

Noncontact visualization of respiration and vital sign monitoring using a single mid-wave infrared ther- mal camera: Preliminary proof-of-concept.Sensors, 26(1): 98, 2025

Takashi Suzuki. Noncontact visualization of respiration and vital sign monitoring using a single mid-wave infrared ther- mal camera: Preliminary proof-of-concept.Sensors, 26(1): 98, 2025. 2

work page 2025
[32]

Quantification of methane emitted by ruminants: a review of methods.Journal of Ani- mal Science, 100(7):skac197, 2022

Luis Orlindo Tedeschi, Adibe Luiz Abdalla, Clementina Al- varez, Samuel Weniga Anuga, Jacobo Arango, Karen A Beauchemin, Philippe Becquet, Alexandre Berndt, Robert Burns, Camillo De Camillis, et al. Quantification of methane emitted by ruminants: a review of methods.Journal of Ani- mal Science, 100(7):skac197, 2022. 2

work page 2022
[33]

Livestock and climate change: Outlook for a more sustain- able and equitable future.ILRI Discussion Paper, 2024

Philip K Thornton, Eva K Wollenberg, and Laura K Cramer. Livestock and climate change: Outlook for a more sustain- able and equitable future.ILRI Discussion Paper, 2024. 1

work page 2024
[34]

Videomae: Masked autoencoders are data-efficient learners for self-supervised video pre-training

Zhan Tong, Yibing Song, Jue Wang, and Limin Wang. Videomae: Masked autoencoders are data-efficient learners for self-supervised video pre-training. InAdvances in Neural Information Processing Systems, pages 10078–10093, 2022. 5, 6

work page 2022
[35]

Repvit: Revisiting mobile cnn from vit perspective

Ao Wang, Hui Chen, Zijia Lin, Jungong Han, and Guiguang Ding. Repvit: Revisiting mobile cnn from vit perspective. InProceedings of the IEEE/CVF conference on computer vi- sion and pattern recognition, pages 15909–15920, 2024. 3, 6

work page 2024
[36]

Invisible gas detection: An rgb-thermal cross attention network and a new benchmark

Jue Wang, Yuxiang Lin, Qi Zhao, Dong Luo, Shuaibao Chen, Wei Chen, and Xiaojiang Peng. Invisible gas detection: An rgb-thermal cross attention network and a new benchmark. Computer Vision and Image Understanding, 248:104099,

work page
[37]

Depth- conditioned dynamic message propagation for monocular 3d object detection

Li Wang, Liang Du, Xiaoqing Ye, Yanwei Fu, Guodong Guo, Xiangyang Xue, Jianfeng Feng, and Li Zhang. Depth- conditioned dynamic message propagation for monocular 3d object detection. InProceedings of the IEEE/CVF Confer- ence on Computer Vision and Pattern Recognition, pages 454–463, 2020. 3

work page 2020
[38]

Infrared imaging detection for hazardous gas leakage using background information and improved yolo networks.Re- mote Sensing, 17(6):1030, 2025

Minghe Wang, Dian Sheng, Pan Yuan, Weiqi Jin, and Li Li. Infrared imaging detection for hazardous gas leakage using background information and improved yolo networks.Re- mote Sensing, 17(6):1030, 2025. 3

work page 2025
[39]

Research progress on methane emission reduction strategies for dairy cows.Dairy, 6(5): 48, 2025

Yu Wang, Kuan Chen, Shulin Yuan, Jianying Liu, Jianchao Guo, and Yongqing Guo. Research progress on methane emission reduction strategies for dairy cows.Dairy, 6(5): 48, 2025. 1

work page 2025
[40]

Segformer: Simple and efficient design for semantic segmentation with transform- ers.Advances in neural information processing systems, 34: 12077–12090, 2021

Enze Xie, Wenhai Wang, Zhiding Yu, Anima Anandkumar, Jose M Alvarez, and Ping Luo. Segformer: Simple and efficient design for semantic segmentation with transform- ers.Advances in neural information processing systems, 34: 12077–12090, 2021. 3, 6

work page 2021
[41]

Efficientsam: Leveraged masked image pretraining for efficient segment anything

Yunyang Xiong, Bala Varadarajan, Lemeng Wu, Xiaoyu Xi- ang, Fanyi Xiao, Chenchen Zhu, Xiaoliang Dai, Dilin Wang, Fei Sun, Forrest Iandola, et al. Efficientsam: Leveraged masked image pretraining for efficient segment anything. In Proceedings of the IEEE/CVF conference on computer vi- sion and pattern recognition, pages 16111–16121, 2024. 3

work page 2024
[42]

Mwirgas-yolo: Gas leakage detection based on mid-wave in- frared imaging.Sensors, 24(13), 2024

Shiwei Xu, Xia Wang, Qiyang Sun, and Kangjun Dong. Mwirgas-yolo: Gas leakage detection based on mid-wave in- frared imaging.Sensors, 24(13), 2024. 3

work page 2024
[43]

Repavit: Scalable vision transformer acceleration via structural reparameterization on feedforward network layers

Xuwei Xu, Yang Li, Yudong Chen, Jiajun Liu, and Sen Wang. Repavit: Scalable vision transformer acceleration via structural reparameterization on feedforward network layers. arXiv preprint arXiv:2505.21847, 2025. 3

work page arXiv 2025
[44]

Efficient transformer encoders for mask2former-style mod- els.arXiv preprint arXiv:2404.15244, 2024

Manyi Yao, Abhishek Aich, Yumin Suh, Amit Roy- Chowdhury, Christian Shelton, and Manmohan Chandraker. Efficient transformer encoders for mask2former-style mod- els.arXiv preprint arXiv:2404.15244, 2024. 3, 6

work page arXiv 2024
[45]

Shvit: Single-head vision transformer with memory efficient macro design

Seokju Yun and Youngmin Ro. Shvit: Single-head vision transformer with memory efficient macro design. InPro- ceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 5756–5767, 2024. 6

work page 2024
[46]

Starcd-net: A remote sensing change detection method combining starnet and differential opera- tors.IEEE Access, 2025

Caijia Zeng, Xiaorong Xue, Chuanlu Li, Xingbiao Xu, Siyue Zhao, and Yifan Xu. Starcd-net: A remote sensing change detection method combining starnet and differential opera- tors.IEEE Access, 2025. 6

work page 2025
[47]

Dairy cow mastitis detection by thermal infrared images based on cle-unet.Animals, 13(13):2211, 2023

Qian Zhang, Ying Yang, Gang Liu, Yuanlin Ning, and Jian- quan Li. Dairy cow mastitis detection by thermal infrared images based on cle-unet.Animals, 13(13):2211, 2023. 2

work page 2023
[48]

Lactnet: A lightweight real-time semantic seg- mentation network based on an aggregated convolutional 10 neural network and transformer.Electronics, 13(12):2406,

Xiangyue Zhang, Hexiao Li, Jingyu Ru, Peng Ji, and Cheng- dong Wu. Lactnet: A lightweight real-time semantic seg- mentation network based on an aggregated convolutional 10 neural network and transformer.Electronics, 13(12):2406,

work page
[49]

Detection of res- piratory rate of dairy cows based on infrared thermography and deep learning.Agriculture, 13(10):1939, 2023

Kaixuan Zhao, Yijie Duan, Junliang Chen, Qianwen Li, Xing Hong, Ruihong Zhang, and Meijia Wang. Detection of res- piratory rate of dairy cows based on infrared thermography and deep learning.Agriculture, 13(10):1939, 2023. 2

work page 1939
[50]

iformer: Integrating convnet and transformer for mobile application.arXiv preprint arXiv:2501.15369, 2025

Chuanyang Zheng. iformer: Integrating convnet and transformer for mobile application.arXiv preprint arXiv:2501.15369, 2025. 3, 6

work page arXiv 2025
[51]

High-accuracy combustible gas cloud imaging system using yolo-plume classification net- work.Frontiers in Physics, V olume 13 - 2025, 2025

Jiani Zhou, Yang Liu, Yong Zhang, Haotian Hu, Zenan Leng, Feng Sun, and Chen Chen. High-accuracy combustible gas cloud imaging system using yolo-plume classification net- work.Frontiers in Physics, V olume 13 - 2025, 2025. 3 11

work page 2025