A Multi-purpose Tracking Framework for Salmon Welfare Monitoring in Challenging Environments
Pith reviewed 2026-05-21 21:05 UTC · model grok-4.3
The pith
A tracking framework that exploits salmon body parts outperforms the pedestrian tracker BoostTrack in underwater scenes and supports automated tail-beat welfare monitoring.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that a pose-estimation-based tracker augmented with body-part-specific modules can resolve salmon ID transfers in crowded scenes and ID switches during turns more reliably than BoostTrack, the state-of-the-art pedestrian tracker it is compared against, while simultaneously producing tracks accurate enough to compute tail-beat wavelength for welfare assessment.
What carries the argument
Pose estimation network that supplies bounding boxes and body-part locations, combined with specialized modules that exploit those locations to correct tracking errors.
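As a rough illustration of what such a network's output could look like and how a module might read it, the sketch below defines a hypothetical `SalmonDetection` record (bounding box plus named keypoints) and derives a heading angle from the head and tail keypoints. The field names, the keypoint labels `head` and `tail`, and both helper functions are assumptions made for illustration; they are not taken from the paper or its code.

```python
from __future__ import annotations

import math
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

Point = Tuple[float, float]


@dataclass
class SalmonDetection:
    """Hypothetical per-frame pose output for one salmon (not the paper's API)."""
    box: Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels
    keypoints: Dict[str, Point] = field(default_factory=dict)


def heading_deg(det: SalmonDetection) -> Optional[float]:
    """Angle of the tail-to-head vector in degrees, or None if a part is missing."""
    head = det.keypoints.get("head")
    tail = det.keypoints.get("tail")
    if head is None or tail is None:
        return None
    return math.degrees(math.atan2(head[1] - tail[1], head[0] - tail[0]))


def heading_change_deg(prev: SalmonDetection, curr: SalmonDetection) -> Optional[float]:
    """Smallest absolute heading difference between two frames, in [0, 180]."""
    a, b = heading_deg(prev), heading_deg(curr)
    if a is None or b is None:
        return None
    diff = abs(a - b) % 360.0
    return min(diff, 360.0 - diff)
```

A large heading change across frames is the kind of part-level cue a turn-handling module could act on; the criterion the paper actually uses is not specified in this summary.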
If this is right
- The method outperforms BoostTrack on both salmon-specific tracking challenges.
- Body-part tracks enable direct calculation of tail-beat wavelength without additional detectors.
- A single pipeline can supply multiple welfare indicators instead of running separate systems for each metric.
- New datasets are provided for evaluating crowded-scene ID transfers and turning-induced ID switches.
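To make the tail-beat point concrete, here is a minimal sketch of how a periodic quantity could be read off a body-part track. How the paper defines and computes tail-beat wavelength is not stated in this summary, so the sketch makes its own assumptions: the input is a per-frame signed lateral offset of the tail keypoint from the body axis, and the function `mean_beat_period` (a hypothetical name) estimates the beat period in seconds from upward zero crossings.

```python
from typing import Optional, Sequence


def mean_beat_period(signal: Sequence[float], fps: float) -> Optional[float]:
    """Mean time in seconds between upward zero crossings of the centred signal.

    Assumption: `signal` holds one lateral tail offset per video frame.
    Returns None when fewer than two upward crossings are found.
    """
    if len(signal) < 2 or fps <= 0:
        return None
    mean = sum(signal) / len(signal)
    centred = [s - mean for s in signal]

    crossings = []
    for i in range(1, len(centred)):
        before, after = centred[i - 1], centred[i]
        if before < 0.0 <= after:
            # Linear interpolation gives a sub-frame crossing time.
            frac = -before / (after - before)
            crossings.append((i - 1 + frac) / fps)

    if len(crossings) < 2:
        return None
    return (crossings[-1] - crossings[0]) / (len(crossings) - 1)
```

Zero crossings are used here only because they need no external libraries; a spectral estimate would be an equally reasonable choice. Either way, an ID switch mid-track would splice two fish into one signal, which is why tracking quality matters for this indicator.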
Where Pith is reading between the lines
- The same body-part approach might transfer to tracking other fish species or marine animals in turbid water.
- Real-time deployment of the framework could support continuous welfare dashboards on large aquaculture farms.
- Extending the modules to additional body-part relations could further reduce ID switches in even denser scenes.
- Because the tracks already contain part-level detail, the system may lower overall compute cost compared with running independent detectors for each welfare indicator.
Load-bearing premise
The specialized modules can use body-part information to fix tracking problems without creating new errors or needing heavy per-scene tuning.
What would settle it
On the released datasets for ID transfer and ID switch challenges, a head-to-head run showing that the proposed tracker does not exceed BoostTrack performance or produces inaccurate tail-beat wavelengths would falsify the central claim.
Original abstract
Computer Vision (CV)-based continuous, automated and precise salmon welfare monitoring is a key step toward reduced salmon mortality and improved salmon welfare in industrial aquaculture net pens. Available CV methods for determining welfare indicators focus on single indicators and rely on object detectors and trackers from other application areas to aid their welfare indicator calculation algorithm. This comes with a high resource demand for real-world applications, since each indicator must be calculated separately. In addition, the methods are vulnerable to difficulties in underwater salmon scenes, such as object occlusion, similar object appearance, and similar object motion. To address these challenges, we propose a flexible tracking framework that uses a pose estimation network to extract bounding boxes around salmon and their corresponding body parts, and exploits information about the body parts, through specialized modules, to tackle challenges specific to underwater salmon scenes. Subsequently, the high-detail body part tracks are employed to calculate welfare indicators. We construct two novel datasets assessing two salmon tracking challenges: salmon ID transfers in crowded scenes and salmon ID switches during turning. Our method outperforms the current state-of-the-art pedestrian tracker, BoostTrack, for both salmon tracking challenges. Additionally, we create a dataset for calculating salmon tail beat wavelength, demonstrating that our body part tracking method is well-suited for automated welfare monitoring based on tail beat analysis. Datasets and code are available at https://github.com/espenbh/BoostCompTrack.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a multi-purpose tracking framework for salmon in underwater net pens that first runs a pose estimator to produce per-salmon bounding boxes together with head, tail and body keypoints, then feeds these into a set of specialized modules that exploit part-level information to reduce ID transfers in crowded scenes and ID switches during turns. Two new tracking-challenge datasets and one tail-beat-wavelength dataset are introduced; the method is reported to outperform BoostTrack on the tracking tasks and to yield usable tail-beat measurements for welfare monitoring. Code and data are released.
Significance. If the performance gains can be attributed to the body-part modules and shown to be robust, the work would supply a single pipeline capable of supporting multiple welfare indicators, lowering the per-indicator overhead that currently limits deployment. The release of three domain-specific datasets and the open-source implementation are concrete contributions to applied multi-object tracking and aquaculture monitoring.
major comments (3)
- [§4.2] §4.2 and Table 2: the headline claim that the specialized modules resolve ID transfers and switches better than BoostTrack rests on an overall MOTA/IDF1 improvement, yet no ablation that removes the body-part modules (or replaces them with a plain SORT/ByteTrack baseline on the same pose boxes) is presented. Without this isolation it is impossible to verify that the reported gains arise from exploitation of tail/head keypoints rather than from other implementation details.
- [§3.3] §3.3: the description of the ID-transfer and turn-switch modules is high-level; no equations, pseudocode or decision thresholds are supplied for how part-keypoint consistency is used to reject or correct associations. This makes it difficult to assess whether the modules introduce new failure modes or require per-scene retuning.
- [§4.3] §4.3: the tail-beat-wavelength evaluation reports qualitative agreement with manual measurements but supplies neither quantitative error statistics (MAE, bias) nor an analysis of how tracking ID switches propagate into wavelength error. This weakens the claim that the body-part tracks are “well-suited” for automated welfare monitoring.
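For readers less familiar with the metrics named in the first major comment, the sketch below computes MOTA and IDF1 from their standard definitions: MOTA = 1 - (FN + FP + IDSW) / GT and IDF1 = 2·IDTP / (2·IDTP + IDFP + IDFN). The function names are chosen here for illustration, and any counts passed in would have to come from an evaluation tool; nothing below reproduces numbers from the paper.

```python
def mota(false_negatives: int, false_positives: int, id_switches: int, num_gt: int) -> float:
    """Multiple Object Tracking Accuracy over a whole sequence.

    num_gt is the total number of ground-truth boxes summed over all frames.
    """
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt


def idf1(idtp: int, idfp: int, idfn: int) -> float:
    """Identity F1 score from identity-level true/false positives and negatives."""
    denom = 2 * idtp + idfp + idfn
    if denom == 0:
        return 0.0
    return 2 * idtp / denom
```

MOTA pools all error types, so identity errors can be masked by detection errors, while IDF1 focuses on identity consistency. Reporting both with and without the body-part modules is what would isolate the modules' contribution.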
minor comments (2)
- [Figure 3] Figure 3 caption and §4.1: the color coding of tracks in the qualitative results is not explained, making it hard to verify the claimed reduction in ID switches.
- [Abstract] The abstract states outperformance “for both salmon tracking challenges” but does not quote the numerical margins; the reader must reach Table 2 to obtain them.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed comments, which highlight important aspects for strengthening the validation and reproducibility of our work. We address each major comment below and indicate the planned revisions.
Point-by-point responses
- Referee: [§4.2] §4.2 and Table 2: the headline claim that the specialized modules resolve ID transfers and switches better than BoostTrack rests on an overall MOTA/IDF1 improvement, yet no ablation that removes the body-part modules (or replaces them with a plain SORT/ByteTrack baseline on the same pose boxes) is presented. Without this isolation it is impossible to verify that the reported gains arise from exploitation of tail/head keypoints rather than from other implementation details.
  Authors: We agree that an explicit ablation isolating the body-part modules would provide clearer evidence for their contribution. The current comparison uses BoostTrack on the pose-derived boxes, but does not include a direct baseline with the specialized modules removed. In the revised manuscript we will add an ablation study that applies a standard SORT/ByteTrack tracker to the identical pose boxes, thereby quantifying the incremental benefit of the part-keypoint consistency logic.
  Revision: yes
- Referee: [§3.3] §3.3: the description of the ID-transfer and turn-switch modules is high-level; no equations, pseudocode or decision thresholds are supplied for how part-keypoint consistency is used to reject or correct associations. This makes it difficult to assess whether the modules introduce new failure modes or require per-scene retuning.
  Authors: We acknowledge that the current description in §3.3 remains conceptual. To improve reproducibility and allow evaluation of potential failure modes or tuning requirements, we will include pseudocode for both modules together with the concrete decision thresholds (keypoint distance and orientation consistency criteria) in the revised version.
  Revision: yes
- Referee: [§4.3] §4.3: the tail-beat-wavelength evaluation reports qualitative agreement with manual measurements but supplies neither quantitative error statistics (MAE, bias) nor an analysis of how tracking ID switches propagate into wavelength error. This weakens the claim that the body-part tracks are “well-suited” for automated welfare monitoring.
  Authors: The §4.3 evaluation was designed to demonstrate practical feasibility on real underwater footage. We agree that quantitative error metrics would strengthen the welfare-monitoring claim. In revision we will report MAE and bias relative to the manual annotations. A full propagation analysis of ID-switch effects on wavelength error will be added where the existing annotations permit; otherwise we will explicitly discuss this as a limitation of the current dataset.
  Revision: partial
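The error statistics promised in the last response are simple to compute once paired measurements exist. The sketch below shows one way to obtain MAE and bias (mean signed error) from automated estimates and manual annotations; the function name `mae_and_bias` and the sign convention (predicted minus reference) are assumptions made here, not details from the paper.

```python
from typing import Sequence, Tuple


def mae_and_bias(predicted: Sequence[float], reference: Sequence[float]) -> Tuple[float, float]:
    """Return (mean absolute error, mean signed error) for paired measurements.

    A positive bias means the automated estimate tends to exceed the reference.
    """
    if len(predicted) != len(reference) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [p - r for p, r in zip(predicted, reference)]
    mae = sum(abs(e) for e in errors) / len(errors)
    bias = sum(errors) / len(errors)
    return mae, bias
```

Reporting both is informative because a low bias can hide large errors of opposite sign, which MAE exposes.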
Circularity Check
No significant circularity; empirical framework evaluated on novel datasets
Full rationale
The paper proposes a tracking framework that extends existing pose estimation and tracking methods (e.g., comparison to external BoostTrack baseline) with specialized modules for salmon-specific challenges, then evaluates performance via newly constructed datasets for ID transfers, ID switches, and tail-beat analysis. No load-bearing derivations, equations, or predictions reduce by construction to fitted inputs, self-definitions, or self-citation chains. The central claims are direct empirical comparisons on external benchmarks and new data, rendering the approach self-contained without circular reductions.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: A pose estimation network trained on salmon images can provide reliable body part detections in real underwater farm conditions.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
  Relation between the paper passage and the cited Recognition theorem: unclear
  Paper passage: "uses a pose estimation network to extract bounding boxes around salmon and their corresponding body parts, and exploits information about the body parts, through specialized modules, to tackle challenges specific to underwater salmon scenes"
- IndisputableMonolith/Foundation/ArithmeticFromLogic.lean · LogicNat induction
  Relation between the paper passage and the cited Recognition theorem: unclear
  Paper passage: "The TurnModule determines whether a salmon is turning by evaluating a counter c... The CrowdedModule checks whether the small body parts associations suggest that the initial salmon bounding box association needs correction"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.